Apache Spark is now available as a service on the IBM Bluemix cloud platform. At IBM Insight 2015 in Las Vegas Monday afternoon, Big Blue also announced that it has redesigned more than 15 core analytics and commerce solutions with Spark.
"It's been an incredible success since we put this in beta back in June," Rob Thomas, vice president of Product Development at IBM Analytics, says of IBM Analytics on Apache Spark, the new Spark-as-a-service offering. "We've had more than 5,000 developers coming in and building applications with it."
The open source Spark processing engine, first developed by the AMPLab at UC Berkeley, was designed for data science uses. Its in-memory processing capabilities mean that its performance can exceed that of the MapReduce engine by up to 100x for some applications, especially those involving machine learning.
Thomas says Spark has allowed IBM to simplify the architecture of many of its software solutions and cloud data services, including IBM BigInsights, IBM Streams and IBM SPSS. Big Blue was able to reduce the code base of its DataWorks data preparation and data refinement service by more than 87 percent, a cut that takes DataWorks' footprint from 40 million lines of code to roughly 5 million. That translates into simplified operations and dramatically reduced build and deployment times, Thomas says.
"For data scientists and engineers who want to do more with their data, the power and appeal of open source innovation for technologies like Spark is undeniable," Thomas says. "IBM is committed to using Spark as the foundation for its industry-leading analytics platform, and by offering a fully managed Spark service on IBM Bluemix, data professionals can access and analyze their data faster than ever before, with significantly reduced complexity."
Thomas says that making Spark available as a service on Bluemix will let developers infuse their apps with real-time analytics that can integrate with open source, proprietary and third-party tools on the Bluemix cloud platform.
Glen Lavigne, president and CEO of Nova Scotia-based SolutionInc, says the service has already made it possible to present SolutionInc customers with better service. SolutionInc provides managed, high-demand, public Wi-Fi and wired access in hotels, conference centers and hotspots across 50 countries. It needs to analyze Wi-Fi data from multiple sources to identify traffic patterns and trends, including peak volume times, busiest locations, route patterns and device types.
"With IBM Spark technology, we were able to explore over 240 million rows of Wi-Fi log information and identify device traffic patterns and data across multiple locations," Lavigne says. "These analytics are enabling us to better understand market demands and trends and provide a better service to our customers."
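The kind of analysis Lavigne describes (peak volume times, busiest locations, device types) comes down to grouping log rows and counting them. The sketch below shows that idea in plain Python rather than Spark. The row layout, the location and device names, and the sample values are all assumptions invented for illustration; the announcement does not describe SolutionInc's actual log schema or code. At 240 million rows, the same group-and-count logic would be run on a distributed engine such as Spark instead of in a single process.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log rows: (ISO timestamp, location, device type).
# These values are made up for illustration only.
SAMPLE_ROWS = [
    ("2015-10-26T09:15:00", "hotel-lobby", "phone"),
    ("2015-10-26T09:40:00", "conference-hall", "laptop"),
    ("2015-10-26T09:55:00", "conference-hall", "phone"),
    ("2015-10-26T14:05:00", "conference-hall", "tablet"),
    ("2015-10-26T14:30:00", "hotel-lobby", "phone"),
    ("2015-10-26T09:05:00", "cafe-hotspot", "phone"),
]


def summarize(rows):
    """Count connections per hour of day, per location, and per device type."""
    by_hour, by_location, by_device = Counter(), Counter(), Counter()
    for timestamp, location, device in rows:
        by_hour[datetime.fromisoformat(timestamp).hour] += 1
        by_location[location] += 1
        by_device[device] += 1
    return by_hour, by_location, by_device


def top(counter):
    """Return the most frequent key in a Counter."""
    return counter.most_common(1)[0][0]


if __name__ == "__main__":
    hours, locations, devices = summarize(SAMPLE_ROWS)
    print("peak hour:", top(hours))
    print("busiest location:", top(locations))
    print("most common device:", top(devices))
```

On the sample rows, the peak hour is 9, the busiest location is the hypothetical "conference-hall", and the most common device type is "phone".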