In today’s blog post I would like to introduce one of the most interesting Google services for building applications. Google BigQuery, launched in 2010, is a tool for analysing large volumes of data.
At first glance Google BigQuery does not seem revolutionary. It operates much like a non-relational database to which we can upload data and from which we can retrieve the pieces we need. However, some features set this service apart. Its main advantages are:
- fast processing of big data
- API libraries for many languages and frameworks
- query result cache
Using BigQuery in our projects has opened up many possibilities: we can build applications that operate on huge volumes of data. The service’s well-designed API makes development much easier, and the resulting programs are highly effective.
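To give an idea of how little code a query takes, here is a minimal sketch using the official google-cloud-bigquery Python client. The table and column names are only examples, and running the query requires configured Google Cloud credentials.

```python
# Minimal sketch of running a query against BigQuery from Python.
# Assumes the google-cloud-bigquery library and application-default
# credentials; table and column names are illustrative only.

def build_top_contributors_query(table: str, limit: int = 10) -> str:
    """Build a simple aggregation query over the given table."""
    return (
        "SELECT contributor_username, COUNT(*) AS edits "
        f"FROM `{table}` "
        "GROUP BY contributor_username "
        f"ORDER BY edits DESC LIMIT {limit}"
    )

def run_query(sql: str):
    """Execute the query and return the result rows (needs credentials)."""
    from google.cloud import bigquery  # pip install google-cloud-bigquery
    client = bigquery.Client()
    return list(client.query(sql).result())

if __name__ == "__main__":
    sql = build_top_contributors_query("bigquery-public-data.samples.wikipedia")
    print(sql)
    # rows = run_query(sql)  # uncomment once credentials are configured
```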
Our first encounter with BigQuery was an application that processes data from CSV files uploaded by users. The amount of data, the number of users and the file sizes are all considerable. Loading the data into BigQuery made browsing the files noticeably faster. In its first stage the application was integrated with the non-relational database MongoDB, but that solution was less effective. Thanks to BigQuery’s query cache and optimised SQL queries, the processing times of the two solutions differed markedly in Google’s favour. The gains are particularly visible with big data.
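A CSV load of the kind described above can be sketched as follows, again assuming the google-cloud-bigquery client; the project, dataset and table names are hypothetical.

```python
# Sketch of loading a user-uploaded CSV file into a BigQuery table.
# Assumes google-cloud-bigquery; project/dataset/table names are
# hypothetical placeholders, not the names from the application itself.

def make_table_id(project: str, dataset: str, table: str) -> str:
    """Compose the fully qualified table identifier BigQuery expects."""
    return f"{project}.{dataset}.{table}"

def load_csv(path: str, table_id: str) -> int:
    """Load a local CSV file into the given table; returns rows loaded."""
    from google.cloud import bigquery
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,   # skip the header row
        autodetect=True,       # infer the schema from the file
    )
    with open(path, "rb") as f:
        job = client.load_table_from_file(f, table_id, job_config=job_config)
    job.result()               # wait for the load job to finish
    return job.output_rows

if __name__ == "__main__":
    table_id = make_table_id("my-project", "uploads", "user_data")
    print(table_id)
    # load_csv("data.csv", table_id)  # needs credentials and a real project
```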
The official project website presents sample data sets that let you test the effectiveness of the solution. Shown below is the time needed to search 150,000 elements from the publicdata:samples.wikipedia data set, along with the time of re-accessing the same set via the cache.
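The cache comparison can be reproduced with a small script: run the same query twice and compare the wall-clock times together with the job's `cache_hit` flag. This is a sketch assuming the google-cloud-bigquery client and the public wikipedia sample table; actual timings will vary.

```python
# Sketch of measuring the effect of BigQuery's query result cache:
# run the same query twice and compare elapsed time and cache_hit.
# Assumes google-cloud-bigquery and configured credentials.
import time

def speedup(first_seconds: float, cached_seconds: float) -> float:
    """Return how many times faster the cached run was."""
    return first_seconds / cached_seconds

def timed_query(sql: str):
    """Run a query and return (elapsed seconds, cache_hit flag)."""
    from google.cloud import bigquery
    client = bigquery.Client()
    start = time.monotonic()
    job = client.query(sql)   # the cache is on by default for repeated queries
    job.result()              # wait for the query to finish
    return time.monotonic() - start, job.cache_hit

if __name__ == "__main__":
    sql = "SELECT title FROM `bigquery-public-data.samples.wikipedia` LIMIT 150000"
    t1, hit1 = timed_query(sql)   # first run: computed from scratch
    t2, hit2 = timed_query(sql)   # second run: typically served from cache
    print(f"first: {t1:.2f}s (cache_hit={hit1}), "
          f"second: {t2:.2f}s (cache_hit={hit2})")
    print(f"speedup: {speedup(t1, t2):.1f}x")
```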