by Thor Olavsrud

MapR brings down data silos with converged data platform

News Analysis
Dec 08, 2015
Big Data, Hadoop, Open Source

The Hadoop distribution specialist has announced MapR Streams, which will combine with its Hadoop distribution and NoSQL database to integrate file, database, stream processing and analytics to support new data-driven applications.

Efforts over the past decade or so to tear down data silos within organizations have somehow created entirely new ones, thanks to the scattershot proliferation of new analytics tools and the consumerization of enterprise software.

On Tuesday, Hadoop distribution specialist MapR Technologies announced its intention to change all that with the MapR Converged Data Platform, which integrates file, database, stream processing and analytics. By natively integrating data-in-motion and data-at-rest in a converged platform, MapR says it will enable developers to create new, innovative applications that reduce data duplication and movement, lower the cost of integration and maintenance associated with multiple platforms and accelerate business results.


The Converged Data Platform brings together three components: the MapR Distribution including Apache Hadoop; MapR-DB, the company's NoSQL database; and MapR Streams. MapR Streams, also announced Tuesday, is a new global event stream system, natively integrated with MapR’s Hadoop distribution, which allows organizations to continuously collect, analyze and act on streaming data.

“Bringing together world-class Apache Hadoop and Apache Spark with a top-ranked NoSQL database and now continuous reliable streaming with global scale is a huge step forward in enabling enterprise developers to create the next-gen apps using big data,” Anil Gadre, senior vice president, product management, MapR Technologies, said in a statement Tuesday.

MapR Chief Marketing Officer Jack Norris notes that MapR Streams can easily scale to handle massive data flows and long-term persistence, while also providing enterprise features like high availability, disaster recovery, security and full data protection.

“This is moving from a batch environment to incorporating the analytics into the production data flow so you can impact business while it’s happening,” Norris says.

For instance, Norris says, the Converged Data Platform will help advertisers provide relevant real-time offers, healthcare providers improve personalized treatment, retailers optimize their inventory and telecom carriers dynamically adjust mobile service areas.

MapR Streams allows developers to:

  • Build scalable, continuous high-throughput streams across thousands of locations with millions of topics and billions of messages
  • Unite analytics, transaction and stream processing to reduce data duplication, latency and cluster sprawl, while using existing open source projects like Spark Streaming, Apache Storm, Apache Flink and Apache Apex
  • Enable reliable message delivery with auto-failover and order consistency
  • Ensure cross-site replication to build global real-time applications
  • Provide unlimited persistence of all messages in a stream

Norris says MapR Streams will be generally available in early 2016.