Google today announced a big update to BigQuery, its service for quickly analyzing large amounts of data. The company has added new features that let users work in real time, query subsets of the latest data, and use more functions.
Before we dig into those, however, it’s worth noting that the browser tool has also received some improvements: you can now browse your query history faster using the Query History panel. More information about each query is now surfaced, and Google has added buttons for common tasks.
First up, BigQuery is now able to load and analyze data in real time through a simple API call: tabledata().insertAll(). This means you can store data as it comes in (instead of building and maintaining systems to cache and upload in batches) and query it as needed. To use it, just call the new endpoint with your data in a JSON object (works for a single row as well as multiple rows of data).
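To make the "JSON object" part concrete, here is a minimal sketch of what a request body for tabledata().insertAll() might look like, built in Python. The row fields and the project, dataset, and table names are hypothetical, and the body layout (a "rows" list whose entries carry an optional "insertId" and a "json" payload) reflects the BigQuery REST API as we understand it, so check it against the current reference before relying on it. No network call is made here.

```python
import json
import uuid


def build_insert_all_body(rows):
    """Wrap plain dicts in the request-body shape used by tabledata().insertAll()."""
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [
            # insertId is an optional per-row ID used for best-effort de-duplication.
            {"insertId": str(uuid.uuid4()), "json": row}
            for row in rows
        ],
    }


# Hypothetical event rows; the field names are made up for illustration.
events = [
    {"user": "alice", "action": "login"},
    {"user": "bob", "action": "purchase"},
]
body = build_insert_all_body(events)
print(json.dumps(body, indent=2))

# With an authorized google-api-python-client `service` object (not built here),
# the call would look roughly like this (all three IDs are placeholders):
# service.tabledata().insertAll(
#     projectId="my-project", datasetId="my_dataset", tableId="my_table",
#     body=body).execute()
```

The same helper covers the single-row case: pass a list containing one dict.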
Google has made streaming data into BigQuery free until January 1, 2014. Starting next year, it will be billed at a flat rate of 1 cent per 10,000 rows inserted. This charge applies only to real-time inserts; the traditional jobs().insert() method will continue to be free.
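For a rough sense of what that rate means in practice, the following sketch estimates the streaming charge for a given row count. It assumes the cost scales linearly with rows and ignores any rounding or minimums Google may apply, which the announcement does not spell out; the function name is our own.

```python
from decimal import Decimal

# Rate from the announcement: 1 cent per 10,000 streamed rows.
PRICE_PER_BLOCK_USD = Decimal("0.01")
ROWS_PER_BLOCK = Decimal(10000)


def streaming_cost_usd(rows):
    """Estimate the streaming-insert charge, assuming simple linear proration."""
    return Decimal(rows) * PRICE_PER_BLOCK_USD / ROWS_PER_BLOCK


for rows in (10_000, 1_000_000, 100_000_000):
    print(f"{rows:>11,} rows -> ${streaming_cost_usd(rows):.2f}")
```

Under that assumption, a million streamed rows comes to about a dollar.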
Next up, BigQuery has gained a new syntax for “table decorators,” letting you define queries that scan only a time range or a single point in time within the previous 24 hours: the last hour of inserted data, what was inserted before that hour, or a snapshot of the table at a specific time. Instead of a “full column scan” when querying data, the new syntax allows you to focus only on a specific subset of the latest data, resulting in lower querying costs. This works with all table-related operations (list, export, copy, and so on).
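As an illustration, the three cases above can be sketched as small Python helpers that build decorated table references. The decorator forms used here (`@<time>` for a snapshot, `@<start>-<end>` for a range, times in milliseconds, negative values meaning "relative to now") are our understanding of the syntax rather than something quoted from the announcement, and the table name is hypothetical.

```python
HOUR_MS = 60 * 60 * 1000  # decorators take times in milliseconds


def snapshot(table, time_ms):
    """Reference the table as it looked at time_ms (negative = relative to now)."""
    return f"[{table}@{time_ms}]"


def time_range(table, start_ms, end_ms=None):
    """Reference only data added between start_ms and end_ms (open end = now)."""
    end = "" if end_ms is None else str(end_ms)
    return f"[{table}@{start_ms}-{end}]"


table = "mydataset.mytable"  # hypothetical dataset and table

last_hour = time_range(table, -HOUR_MS)         # inserted in the last hour
before_last_hour = snapshot(table, -HOUR_MS)    # table as of one hour ago
at_fixed_time = snapshot(table, 1372713600000)  # absolute time, ms since epoch

for ref in (last_hour, before_last_hour, at_fixed_time):
    print(f"SELECT COUNT(*) FROM {ref}")
```

Because the decorator is part of the table reference, the same strings could in principle be handed to other table operations such as copy or export.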
Last but not least, Google has added new window functions, SUM(), COUNT(), AVG(), MIN(), MAX(), FIRST_VALUE(), and LAST_VALUE(), as well as new statistical functions: COVAR_POP(), COVAR_SAMP(), STDDEV_POP(), STDDEV_SAMP(), VAR_POP(), and VAR_SAMP(). Google is taking your feedback over on Stack Overflow.
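If the _POP and _SAMP suffixes are unfamiliar, the difference is the usual one between population and sample statistics: the former divides by n, the latter by n - 1. The sketch below shows that distinction with Python's standard statistics module on made-up numbers. It does not call BigQuery, and it assumes the new functions follow the conventional definitions.

```python
import statistics

# Made-up sample values, used only to show the two conventions side by side.
values = [2, 4, 4, 4, 5, 5, 7, 9]

var_pop = statistics.pvariance(values)   # divides by n      (cf. VAR_POP)
var_samp = statistics.variance(values)   # divides by n - 1  (cf. VAR_SAMP)
stddev_pop = statistics.pstdev(values)   # sqrt of var_pop   (cf. STDDEV_POP)
stddev_samp = statistics.stdev(values)   # sqrt of var_samp  (cf. STDDEV_SAMP)

print("population variance:", var_pop)
print("sample variance:    ", var_samp)
print("population stddev:  ", stddev_pop)
print("sample stddev:      ", stddev_samp)
```

The sample versions always come out slightly larger, and the gap shrinks as the number of rows grows.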
Top Image Credit: Pawel Kryj