This article was published on March 27, 2014

Facebook, Google, LinkedIn, and Twitter launch WebScaleSQL, a custom version of MySQL for massive databases


Facebook, Google, LinkedIn, and Twitter today announced WebScaleSQL, a collaborative project that brings engineers from the four companies together to tackle the challenges of working with massive databases. As its name suggests, WebScaleSQL is a custom version of MySQL designed for large-scale Web companies.

The four companies will share a common set of changes to the upstream MySQL branch, released as open source. The project will include contributions from the MySQL engineering teams at all four companies, and since it will be open, others who have the scale and resources to customize MySQL will be able to join the effort and provide input as well.

In a blog post, Facebook revealed what the engineers involved in WebScaleSQL have so far changed to aid in the development of the new branch:

  • An automated framework that will, for each proposed change, run and publish the results of MySQL’s built-in test system (mtr).
  • A full new suite of stress tests and a prototype automated performance testing system.
  • Several changes to the tests already found in MySQL, and to the structure of some existing code, to avoid problems where otherwise safe code changes had previously caused tests to fail or caused unnecessary conflicts. These changes make it easier to work on the code and helped us get started creating WebScaleSQL.
  • Several changes to improve the performance of WebScaleSQL, including buffer pool flushing improvements; optimizations to certain types of queries; support for NUMA interleave policy; and more.
  • New features that make operating WebScaleSQL at true web scale easier, such as super_read_only, and the ability to specify sub-second client timeouts.

Facebook also revealed what its own WebScaleSQL team is currently working on. The work includes an asynchronous MySQL client, so that applications do not have to wait to connect, send, or retrieve while querying MySQL, and the addition of its Logical Read-Ahead mechanism, which the company says speeds up full table scans by as much as 10x. The company is also preparing to move its production-tested versions of table, user, and compression statistics into WebScaleSQL, as well as to push the remaining components of its current production-tested version of compression into WebScaleSQL.
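To make the idea behind an asynchronous client concrete, here is a minimal Python sketch. It is not Facebook's client and does not talk to MySQL at all: `fake_query`, `run_concurrently`, and `timed_run` are hypothetical names, and `asyncio.sleep` merely stands in for the network wait of a real query. It only illustrates why not blocking on each round trip can help when several queries are in flight.

```python
import asyncio
import time
from typing import Dict, List, Tuple


async def fake_query(name: str, latency: float) -> str:
    """Hypothetical stand-in for a database round trip.

    asyncio.sleep simulates time spent waiting on the network;
    no real MySQL connection is made.
    """
    await asyncio.sleep(latency)
    return f"{name}: done"


async def run_concurrently(latencies: Dict[str, float]) -> List[str]:
    """Issue every simulated query at once and wait for all of them."""
    tasks = [fake_query(name, lat) for name, lat in latencies.items()]
    # gather returns results in the same order the tasks were passed in
    return await asyncio.gather(*tasks)


def timed_run(latencies: Dict[str, float]) -> Tuple[List[str], float]:
    """Run the simulated queries and report the wall-clock time taken."""
    start = time.perf_counter()
    results = asyncio.run(run_concurrently(latencies))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run({"users": 0.2, "orders": 0.3, "stats": 0.1})
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the simulated waits overlap, the total time should be close to the slowest single query rather than the sum of all three, which is the general benefit an asynchronous client aims for.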

WebScaleSQL on GitHub
