While most of the talk around big data assumes such systems will be deployed in-house, Google is building a service that will allow for analyzing large amounts of data in the cloud.
The service, called BigQuery, would help organizations analyze their data without the need for building infrastructure, explained Google product manager Ju-Kay Kwek, speaking at the GigaOm Structure Data conference being held this week in New York.
With BigQuery, "you can build applications and share data all as a service," Kwek said. Google now offers the service in a limited preview mode for a handful of customers. Kwek did not say when the service would be fully available.
Google executives saw a natural opportunity to offer a query service because they had already developed some of the needed tools for internal use. "Indexing the Web is a big data problem to begin with," Kwek said. The company also does a lot of analysis of how people use its services, such as Gmail. Operational data can provide glimpses into which features are or are not working.
Google's key to successful analysis is to keep all the data it generates. "The fine-grain data is key," Kwek said. In many cases, Google engineers do not know beforehand which questions to ask of the company's data. The key is to ask lots of questions. "The more questions you can quickly ask, the smarter those questions get," he said.
The trial users are testing the service in various ways. Customers upload a set of data and then stream updates to the data set as they become available. They can then use Google's algorithms and query language to parse the data sets. Online ad providers use the service to collect a range of data about user actions and to gather metrics on how well their ads are performing. Another customer, a hotel, uses the service to manage revenue data, mashing up information from different financial systems to make fiscal projections.
The cloud model for big data offers a number of benefits, Kwek said. Most notably, it eliminates the need for an organization to provision and set up a data warehouse.
"We take care of all the plumbing," Kwek said.
Security and data backup are included in the service. As a result, using a service could "reduce the time to insight," Kwek said.
One user is We Are Cloud, a company based in Montpellier, France, that offers business intelligence capabilities as a cloud service. The service, called Bime, is marketed to small and medium-sized businesses that may not have the staff to maintain an in-house business intelligence system. The service's dashboard can draw information from multiple sources, such as in-house relational databases. Customers "want tools to become self-aware," said Rachel Delacour, CEO and co-founder of We Are Cloud.
The company became a beta user, allowing it to offer BigQuery to its customers. One customer, an unnamed telecommunications company located in the Middle East, uploaded 15 terabytes of customer data that it could then analyze using Bime. Delacour said she could not reveal how much Google charges to host the data, but said the rate has been quite reasonable.
"It makes sense in a dynamic environment. It gives you near [real-time] response on top of very large data sets," Delacour said of BigQuery, in an interview following her presentation.