Facebook open-sources Presto, a homegrown SQL query engine for mining its enormous data warehouses

Facebook is open-sourcing Presto, an SQL query engine that it developed in-house to help analysts, data scientists and engineers pick apart the information stored in its enormous data warehouses.

Development of Presto began in the fall of 2012, and the system was released to all Facebook employees last spring. It is now used by over 1,000 employees, who run more than 30,000 queries a day that together scan at least one petabyte of data. Facebook says it’s “ten times better” than alternatives such as Hive and MapReduce in terms of CPU efficiency and latency for the majority of queries submitted by its employees.

“It currently supports a large subset of ANSI SQL, including joins, left/right outer joins, subqueries, and most of the common aggregate and scalar functions, including approximate distinct counts (using HyperLogLog) and approximate percentiles (based on quantile digest),” Martin Traverso, a software engineer at Facebook, said.
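
For readers curious about how an approximate distinct count can work, here is a minimal Python sketch of the HyperLogLog idea. It is a toy written for illustration, not Presto's code: the class name, the choice of SHA-1 as the hash and the 4,096-register default are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog counter. Illustrative only; not Presto's implementation."""

    def __init__(self, p: int = 12) -> None:
        self.p = p                        # number of hash bits used as a register index
        self.m = 1 << p                   # number of registers (4,096 when p = 12)
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value: str) -> None:
        # Assumption: a 64-bit hash taken from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)                 # top p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> float:
        # Harmonic-mean estimate across all registers.
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(50_000):
        hll.add(f"user-{i}")
        hll.add(f"user-{i}")   # duplicates do not change the estimate
    print(round(hll.count()))
```

The counter keeps only a few thousand small integers no matter how many values it sees, which is why this kind of estimate suits very large tables. In Presto itself, the equivalent is exposed through SQL aggregate functions such as `approx_distinct` rather than anything resembling this class.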

You can check out the code and documentation for Presto using the links below:

Documentation | Code

Image Credit: Ed Jones/AFP/Getty Images
