MemSQL paves a smoother path to Spark for real-time analytics

Its new tool lets companies tap Spark without writing any code

[Image: MemSQL's Spark Streamliner. Credit: MemSQL]

Now that companies are recognizing the benefits of analytics and big data, the next step is putting those benefits within closer reach. Toward that end, MemSQL on Thursday unveiled a new tool designed to help companies tap Apache Spark without writing any code.

Spark Streamliner is a tool that integrates MemSQL's in-memory database and Apache Spark's in-memory data-processing framework for streaming data from real-time sources such as sensors, Internet-of-Things (IoT) devices, transactions, applications and logs.

Offering "one click" deployment of integrated Spark along with a Web-based interface, it allows users to create multiple data pipelines in minutes, perform custom transformations in real time and develop new analytics applications, MemSQL said.

Hooked up with a real-time data source like Apache Kafka, Spark Streamliner supports thousands of concurrent users running real-time analytical queries. Data is streamed directly into MemSQL. There's no need to extract, transform and load (ETL) data in batch fashion; rather, users can process data as it streams in, removing the delay that batch loading puts between an event and the queries that can see it.
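To make the contrast with batch ETL concrete, here is a minimal sketch of a per-record pipeline. It is not Spark Streamliner's API: the function names, the `readings` table and the sensor message format are all hypothetical, a plain list of JSON strings stands in for a Kafka topic, and Python's built-in SQLite stands in for MemSQL. The point is only that each record is extracted, transformed and written as it arrives, so it can be queried with SQL right away.

```python
import json
import sqlite3
from typing import Iterable, Iterator, Tuple


def extract(raw_messages: Iterable[str]) -> Iterator[dict]:
    """Extractor: parse each raw message as it arrives.

    Hypothetical stand-in for a consumer reading from a source such as Kafka.
    """
    for raw in raw_messages:
        yield json.loads(raw)


def transform(records: Iterable[dict]) -> Iterator[Tuple[str, float]]:
    """Transformer: convert one reading at a time (Fahrenheit to Celsius)."""
    for rec in records:
        temp_c = round((rec["temp_f"] - 32) * 5 / 9, 2)
        yield rec["sensor"], temp_c


def load(conn: sqlite3.Connection, rows: Iterable[Tuple[str, float]]) -> None:
    """Loader: insert and commit row by row instead of waiting for a batch."""
    for row in rows:
        conn.execute("INSERT INTO readings (sensor, temp_c) VALUES (?, ?)", row)
        conn.commit()  # the row is visible to SQL queries from this point on


def run_pipeline(raw_messages: Iterable[str]) -> sqlite3.Connection:
    """Wire extractor, transformer and loader into one streaming pipeline."""
    # Assumption: in-memory SQLite is used here only so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, temp_c REAL)")
    load(conn, transform(extract(raw_messages)))
    return conn


if __name__ == "__main__":
    stream = ['{"sensor": "a", "temp_f": 212}', '{"sensor": "b", "temp_f": 32}']
    db = run_pipeline(stream)
    for sensor, temp_c in db.execute("SELECT sensor, temp_c FROM readings"):
        print(sensor, temp_c)
```

Because the three stages are chained generators, no stage waits for the full input before the next one starts, which is the property that separates a streaming pipeline from a batch job.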

Featuring a simple SQL interface, Spark Streamliner can easily be connected to popular analytical tools, MemSQL said. Users can also share a single resource pool for multiple pipelines, effectively reducing their total hardware footprint.

A video demonstrates MemSQL Spark Streamliner in action. The open-source tool and a library of example extractors and transformers are now available on GitHub.
