As big data applications move from development and proof of concept to production, the need for management and operations tools is becoming ever more pronounced. Enter big data application platform company Concurrent, primary sponsor of the open source Cascading application framework.
Concurrent today announced Driven, which the company says is the first application performance management product for big data applications. Chris Wensel, author of the Cascading project and founder and CTO of Concurrent, says Driven is purpose-built to address the pain points of enterprise application development and application performance management on Apache Hadoop.
"Driven is a powerful step forward in delivering the full promise of connecting business with big data," Wensel says. "Gone are the days when developers must dig through log files for clues to slow performance or failures of their data processing applications. The release of Driven further enables enterprise users to develop data-oriented applications on Apache Hadoop in a more collaborative, streamlined fashion. Driven is the key to unlock enterprises' ability to drive differentiation through data. There's a lot more to come—this is only the beginning."
Governance and compliance tools are among the features on tap for the future, Wensel says.
The idea behind Driven is to give developers, data analysts, data scientists and operations staff the ability to see key application metrics in real time, so they can isolate and resolve problems quickly.
Driven Helps Visualize, Diagnose and Resolve Big Data Failures
Driven is essentially a plug-in for Cascading, an application framework that sits atop Hadoop and works with all the major Hadoop distributions. Once installed, Driven immediately begins collecting telemetry data from your running Cascading applications. That includes applications written in the popular domain-specific languages (DSLs) built on Cascading, such as Scalding (Scala DSL on Cascading), Cascalog (Clojure DSL on Cascading), Lingual (ANSI SQL on Cascading) and Pattern (Predictive Model Scoring on Cascading). The telemetry data gives Cascading users the ability to visualize their data applications and to diagnose and quickly resolve application failures and performance problems.
Concurrent says Driven will allow developers and enterprises to achieve the following:
- Accelerate time to market. Driven reduces the time to application production with process visualization and monitoring capabilities, allowing you to quickly understand complex applications and data flows by drilling down into each application at runtime using a rich user interface.
- Build reliable applications. With detailed insight into your data processing logic and algorithms, you can ensure they are executing properly. Driven surfaces key application metrics around each data process to provide insights around data accuracy.
- Optimize application performance. Driven allows you to understand the performance and capacity of the applications running on your infrastructure by providing key application behavior metrics like data skew and runtime parallelization. You can also compare this information with historical data to track application performance trends, both in development and in production.
Driven represents a first for Concurrent in other ways as well. Until now, all of Concurrent's output has been open source. But Driven is a proprietary product on top of Cascading, Wensel explains. It comes in two flavors: Driven and Driven Enterprise.
Driven, available now in public beta, is a free cloud service for development environments only. Concurrent will provide online support for Driven. Driven Enterprise, which Wensel says will be generally available in the second quarter of 2014, will require an annual subscription. It is intended for both development and production environments and will support both developers and operations. Driven Enterprise will be available both on-premises and in the cloud, and Concurrent will provide enterprise support for the product.
"The product itself will be closed," Wensel explains. "Driven is an open service and freely available for use, but you don't get the source code. We feel this is the most natural, non-conflicting way to monetize open source."
"We want the initial offering to be online so we can iterate and learn," he adds. "We're putting out a basic initial set of features. We have a long list of things we're going to want to add."
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.