by Thor Olavsrud

Pentaho adds Amazon EMR, SAP HANA to data integration platform

Jun 09, 2015
Analytics, Big Data, Technology Industry

The data integration specialist, recently acquired by Hitachi Data Systems (HDS), updates its platform to offer support for Amazon EMR and SAP HANA.

On the heels of Hitachi Data Systems (HDS) closing on its acquisition of data integration specialist Pentaho last week, Pentaho today released an update to its platform that adds integration for Amazon Elastic MapReduce (EMR) and SAP HANA.

“It’s really all about future-proofing big data environments,” says Chuck Yarbrough, director of big data marketing at Pentaho. “As people continue to invest in being a data-driven enterprise and building out big data infrastructure, we pride ourselves on being able to future-proof these investments.”

Yarbrough says that the Pentaho 5.4 release focuses on three themes:

  • Deploying big data in the cloud
  • Being able to blend all data across the enterprise
  • Giving customers confidence in growing and scaling their Hadoop environments

[Image: New capabilities in Pentaho 5.4.]

Pentaho 5.4 allows customers to use Amazon EMR to natively transform and orchestrate data, and design and run Hadoop MapReduce in-cluster on EMR. That, in turn, gives organizations new options for how they can operationalize a cloud-based data refinery architecture for on-demand governed delivery of data sets.

“We’ve supported cloud deployment of Hadoop in the past,” Yarbrough says. “But now we’ve opened up the full ability to support an entire Amazon AWS instance. You can now push your data into EMR and then process that data at scale inside Hadoop with Pentaho Data Integration.”

Pentaho 5.4 also adds an interface from Pentaho Data Integration (PDI) into SAP HANA at the request of a number of its larger customers, as well as Hitachi. The integration enables governed data delivery across multiple structured and unstructured sources.

Big data at scale

Along with support for integration with Amazon EMR and SAP HANA, the Pentaho 5.4 release adds capabilities around big data orchestration and analytics at scale, all based on Pentaho’s Big Data Blueprints use case designs. The new capabilities include the following:

  • Integration of PDI with Apache Spark, enabling orchestration of Spark jobs
  • New APIs to simplify embedding of analytics into business applications and processes
  • The capability to localize Pentaho in French, German and Japanese

Pentaho 5.4 is immediately available. Yarbrough says he expects the 6.0 release to come near the end of the year.
