Open source Hadoop distribution specialist Hortonworks wants to close the loop on predictive analytics, allowing it to turn what it calls the "Internet of Anything" into actionable insights. To get there, it announced today that it has signed a definitive agreement to acquire Onyara, creator of and key contributor to the top-level Apache NiFi open source project.
NiFi was born eight years ago as Niagarafiles, a National Security Agency (NSA) project for automating data flows among multiple computer networks, even when data formats and protocols differ. The agency released NiFi to open source via the Apache Software Foundation late last year as part of the NSA Technology Transfer Program. Onyara was founded last December by the engineers who were the key contributors to Niagarafiles, and opened for business in March 2015. NiFi became a top-level Apache project in July 2015.
Hortonworks DataFlow aims to make data actionable
Shaun Connolly, vice president of Corporate Strategy at Hortonworks, says the acquisition is expected to close in the third quarter of this year. Onyara will be folded into Hortonworks, with the Onyara engineers forming the core of the team behind a new Hortonworks product: Hortonworks DataFlow.
"It's really about getting a solution in place that enables us to capture all the Internet of Anything data and turn that into actionable insights," Connolly says.
"If you look at a lot of the definitions of the Internet of Things, people associate it with just sensors and machine data," he adds. "The Internet of Anything in our parlance also includes clickstream data and social stream data. It also comes from people and their interactions; it encompasses all data from any thing, person, machine or what have you."
Hortonworks DataFlow is intended to make it easy for customers to automate and secure data flows and to collect, conduct and curate real-time business insights and actions derived from data in motion. Hortonworks DataFlow will be a separate product from Hortonworks Data Platform (HDP) and won't require HDP — or even Hadoop — though Connolly notes the combination of Hortonworks DataFlow and HDP will allow users to bring together streaming data and historical data into a cohesive whole to create predictive analytics applications.
"The combination of streaming analytics and rich historical analytics, you need to combine them both to have a real-time predictive application — a predictive application that may go back to the edge device and impact how it's behaving," he says.
Internet of Anything applications are driven by data flowing from machines, sensors, geo-location devices, social streams, clicks, logs and more from the edge to the data lake in real time at full fidelity. Many of these applications need two-way connections and security from the edge to the data center. And beyond security, the "jagged edge" of the Internet of Anything also increases the need for data protection, governance and provenance. Connolly says Hortonworks DataFlow will simplify and accelerate the flow of data in motion into HDP for full-fidelity analytics. Organizations could also flow the data into Apache Kafka and then into Apache Storm or Apache Spark for streaming analytics.
"The NiFi user interface and ease of extension have made it extremely easy to get up and running and even customize," says Craig Connell, chief technology officer of Leverege, a developer of software products for intelligently managing and visualizing large networks of diverse Internet of Things sensors. "It is great that it also easily integrates with other parts of the Apache big data world like Spark, Kafka and Hadoop."
"NiFi's well-designed, mature API has made our integration process remarkably straightforward," adds Mike Bishop, chief systems architect at Prescient Edge, a security integration and technology development firm. "With it, we're able to track the origin, transformation and persistence of data throughout our analytic processes."
100 percent open
Hortonworks DataFlow will be available as an additional subscription from Hortonworks, alongside its Hortonworks Data Platform Enterprise and Hortonworks Data Platform Enterprise Plus subscriptions. Hortonworks DataFlow itself will be 100 percent open source, Connolly says. The subscription will cover maintenance, support and advice on the best ways to deploy the architecture. Connolly notes it will also allow Hortonworks to engage ISVs and systems integrators interested in the Internet of Things.
The Onyara deal marks Hortonworks' third major acquisition. It acquired data security specialist XA Secure in May 2014 and released its technology to the Apache Software Foundation as the Apache Ranger project. It followed that deal in April 2015 with the acquisition of Budapest, Hungary-based product and services company SequenceIQ, developer of the Cloudbreak Hadoop as a Service API for multi-tenant clusters and Periscope for bringing policy-based autoscaling to Hadoop.