by Thor Olavsrud

Hortonworks release cadence balances innovation with reliable Hadoop core

Mar 02, 2016
Analytics, Big Data, Hadoop

The Hadoop distribution vendor will update core Apache Hadoop components once a year, while continually updating services that run on top of Hadoop.

Hortonworks today released the newest version of its Hortonworks Data Platform (HDP) and Hortonworks DataFlow (HDF), and unveiled a new distribution strategy that will provide enterprise customers with a more stable Apache Hadoop core while increasing the release cadence of the extended services that run on top of it.


“This is a defining moment for how we deliver advancements to our customers,” says Tim Hall, vice president of product management at Hortonworks. “We can give customers all the latest innovations in the moment without sacrificing a stable and reliable core. This will change the way people consume Hadoop.”

Beginning today, the distribution strategy for HDP will follow two different release cadences:

  • Core Apache Hadoop components (HDFS, MapReduce and YARN) and Apache Zookeeper (which provides Hadoop with a distributed configuration service, synchronization service and naming registry) will be updated annually. Those components will be aligned with the efforts of the ODPi consortium to create a common reference platform for Apache Hadoop and related big data technologies.
  • Extended Services (including Spark, Hive, HBase, Ambari and more), which run on top of the core, will be logically grouped together and released continually throughout the year to match the pace of innovation within each project team in the community.

To underscore its commitment to the new distribution strategy, Hortonworks on Tuesday announced the general availability of Apache Spark 1.6, Apache Ambari 2.2 and SmartSense 1.2 in HDP 2.4.

HDF joined with HDP will drive actionable insights

In addition, Hortonworks DataFlow (HDF) 1.2, its data-in-motion platform for real-time streaming of data, will be available in the first quarter of this year. The new release improves streaming analytics by adding support for Apache Kafka and Apache Storm.

By bringing together HDP for data-at-rest and HDF for data-in-motion into an integrated whole, Hortonworks is seeking to provide what it calls its Connected Data Platforms. The idea is to close the loop on predictive analytics, turning what the company calls the “Internet of Anything” into actionable insights.

HDF is intended to make it easy for customers to automate and secure data flows and to collect, conduct and curate real-time business insights and actions derived from data in motion. Combined with HDP, it allows users to bring together streaming data and historical data into a cohesive whole to create predictive analytics applications.


“One of the reasons we chose Hortonworks was because its Connected Data Platforms are 100 percent open, allowing us to stay tightly aligned with the latest innovations coming from the open community,” says Jan Rock, senior manager at Hortonworks customer Royal Mail Group. “Together, HDP for data-at-rest and HDF for data-in-motion are exactly what our business needs to drive actionable insights.”