Concurrent says Cascading 3.0 will support local in-memory, Apache MapReduce and Apache Tez out of the gate, with support for Apache Spark and Apache Storm soon to follow.

Organizations are increasingly focusing on building enterprise data applications on top of their Hadoop and NoSQL infrastructure. But even as that’s happening, Hadoop itself is becoming much more diverse and complex. That’s a potential headache for developers seeking to build applications on top of that data infrastructure, but data application platform specialist Concurrent, primary sponsor of the open source Cascading application framework, sees it as an opportunity.

While Apache Hadoop began as a combination of the Hadoop Distributed File System (HDFS) for file storage and MapReduce for compute, there is now a growing number of options for compute in Hadoop, including Apache Tez (a framework for near real-time big data processing), and the soon-to-be-released Apache Spark (a framework for in-memory cluster computing) and Apache Storm (a distributed computation framework for stream processing). Hadoop distribution vendor MapR even offers an alternative to HDFS in its distribution.

“Thinking in MapReduce is one thing, but then having to think in Tez is something else,” says Chris Wensel, founder and CTO of Concurrent and original author of Cascading. “It’s a huge challenge.”

“Hadoop is balkanizing and fracturing,” he adds. “There is no more Hadoop. There’s HDFS and whatever runs on top of it.”

Cascading Is a Software Abstraction Layer for Hadoop

Cascading is a software abstraction layer for Apache Hadoop that is intended to allow developers to write their data applications once and then deploy those applications on any big data infrastructure, regardless of the components in use.
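
The write-once idea can be illustrated with a minimal sketch. This is not Cascading's actual Java API; the names `WordCountFlow`, `LocalInMemoryFabric` and `MapReduceStyleFabric` are hypothetical and exist only for this example. It assumes a word-count job described once and then handed to two interchangeable execution back ends, which should agree on the result.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class WordCountFlow:
    """Fabric-independent description of the job (hypothetical, not Cascading's API)."""
    tokenize: Callable[[str], List[str]]


class LocalInMemoryFabric:
    """Hypothetical back end: runs the flow in one in-memory pass."""

    def run(self, flow: WordCountFlow, lines: Iterable[str]) -> Dict[str, int]:
        return dict(Counter(word for line in lines for word in flow.tokenize(line)))


class MapReduceStyleFabric:
    """Hypothetical back end: runs the same flow as map, shuffle and reduce phases."""

    def run(self, flow: WordCountFlow, lines: Iterable[str]) -> Dict[str, int]:
        # Map: emit a (word, 1) pair for every token.
        mapped = [(word, 1) for line in lines for word in flow.tokenize(line)]
        # Shuffle: group the emitted values by key.
        shuffled = defaultdict(list)
        for word, one in mapped:
            shuffled[word].append(one)
        # Reduce: sum the grouped values for each key.
        return {word: sum(ones) for word, ones in shuffled.items()}


# The application logic is written once...
flow = WordCountFlow(tokenize=str.split)
data = ["tez spark tez", "storm spark tez"]

# ...and executed unchanged on each back end.
results = {
    type(fabric).__name__: fabric.run(flow, data)
    for fabric in (LocalInMemoryFabric(), MapReduceStyleFabric())
}
```

In this sketch only the object that executes the flow changes; the flow definition itself never mentions a specific back end, which is the property the article attributes to Cascading.
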
That portability is what has allowed Concurrent to win big Web 2.0 customers like eBay, LinkedIn, Twitter and Pinterest (as well as a slew of others) and what now contributes to more than 150,000 user downloads a month. Customers use it to build applications ranging from enterprise IT uses like ETL and operational analysis to corporate apps like HR analytics, telecom apps like location-based services, marketing apps like funnel analysis and ad optimization, consumer/entertainment apps like music recommendations, finance apps like fraud and anomaly detection, and health/biotech apps like veterinary diagnostics and next-generation genomics.

Wensel says he originally wrote Cascading in anger: after using MapReduce once, he was determined that no one would have to use it directly again. Now, with Cascading 3.0, announced today, the framework will go even further. It’s not just about MapReduce anymore.

Cascading 3.0 Will Support Emerging Big Data Fabrics

Cascading 3.0 will allow data apps to execute on existing and emerging fabrics through its new customizable query planner, says Wensel. When released, it will support local in-memory, Apache MapReduce and Apache Tez out of the gate, with support for Apache Spark and Apache Storm soon to follow.

The idea is to allow enterprises to standardize on one API for building data applications that solve a variety of business problems, ranging from simple to complex, regardless of latency or scale. In addition, Wensel says third-party products, data applications, frameworks and dynamic programming languages built on Cascading (like Scalding or Cascalog) will immediately benefit from the portability.

Concurrent has also forged close strategic partnerships with Hortonworks (one of the primary sponsors of Apache Hadoop) and Databricks (the primary sponsor of Apache Spark).
Hortonworks will now integrate the Cascading SDK with its Hortonworks Data Platform (HDP) distribution of Hadoop, and will certify and support the SDK with HDP. Concurrent says Cascading will also support Apache Spark in a future release, and notes that companies using Cascading will be able to run their applications on Spark seamlessly.

Concurrent says Cascading 3.0 will be available early this summer and freely licensable under the Apache 2.0 License Agreement.

Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.