by Thor Olavsrud

Big Data Suite Goes Open Source

Feb 17, 2015

Pivotal is releasing its data lake architecture components, including Greenplum DB, GemFire and HAWQ, to the open source community.


Last spring, Pivotal unveiled its Pivotal Big Data Suite, a subscription-based software, support and maintenance package that bundled its big data components into a single, simple licensing structure. The Big Data Suite was responsible for $40 million of the $100 million in total business Pivotal did in 2014. Today, the company took the unprecedented step of open sourcing all those components.

The company released its Pivotal HD Hadoop distribution to open source, along with Pivotal Greenplum Database, Pivotal GemFire real-time distributed data store, Pivotal SQLFire (a SQL layer for the real-time distributed data store), Pivotal GemFire XD (in-memory SQL over HDFS) and Pivotal HAWQ parallel query engine over HDFS.

Breaking Down Enterprise Barriers

Pivotal hasn’t abandoned its ambitions for the Big Data Suite, according to Sundeep Madra, vice president, Data Product Group, Pivotal. Rather, he says, the company will continue to sell it with premium features in a model proven by Pivotal Cloud Foundry. As with Cloud Foundry, he says, releasing all the components to open source reduces enterprise barriers to adoption by easing lock-in fears, while also paving the way for the creation of a community of development and support around the components.


“Pivotal Big Data Suite is a major milestone in the path to making big data truly accessible to the enterprise,” Madra says. “By sharing Pivotal HD, HAWQ, Greenplum DB and GemFire capabilities with the open source community, we are contributing to the market as a whole the necessary components to build solutions that work for all data needs. Releasing these technologies to be open source projects will only help accelerate adoption and innovation for our customers.”

Madra notes that Pivotal Big Data Suite will support bare-metal commodity hardware, appliance-based delivery and virtualized instances, as well as public, private and hybrid clouds. Pivotal Cloud Foundry will be included, providing the ability to use Big Data Suite capabilities in Pivotal Cloud Foundry applications.


Pivotal also announced a number of new application services for Big Data Suite, including the following:

  • Pivotal Big Data Suite on Pivotal Cloud Foundry. This service lets applications running on the open cloud platform as a service use the suite's advanced data services.
  • Spring XD. This service is a highly scalable open source distributed framework for data ingestion, batch processing and analytic pipeline management.
  • Redis. This service is a scalable open source key-value store and data structure server.
  • RabbitMQ. This service is a scalable open source reliable message queue for applications.

Pivotal also announced a new strategic partnership with Hadoop distribution vendor Hortonworks, which marries Pivotal’s SQL on Hadoop, analytical database and NoSQL in-memory technologies with Hortonworks’ expertise and support for Hadoop.

The partnership is designed to achieve the following:

  • Pivotal and Hortonworks will align engineering teams to accelerate the enterprise capabilities of Apache Hadoop with Pivotal technologies including Pivotal HAWQ and GemFire.
  • Pivotal and Hortonworks intend to provide all available advanced services in the Big Data Suite on the Hortonworks Data Platform. They will start with HAWQ.
  • Hortonworks will provide escalation level support for Pivotal HD 3.0, Pivotal’s Hadoop distribution.

“Pivotal and Hortonworks are both shaping a new data infrastructure through open source technology,” Madra says. “Our partnership with Hortonworks maintains the speed and flexibility of open source, while bringing world-class, modern data analytics and services vital to enterprises. Most importantly, the move away from fragmentation to a unified approach for a common Hadoop core will help remove key barriers to the widespread enterprise adoption of Hadoop.”

“We’re really excited to see technologies like HAWQ and GemFire become open source,” adds Shaun Connolly, vice president of Corporate Strategy, Hortonworks. “We have customers that are deploying data lakes and they want the value and capabilities of HAWQ. I’m committed to making that happen.”