The ODPi Runtime Specification and test suite are designed to help organizations write Hadoop-based applications once, with the confidence that they will run on a variety of Hadoop distributions.

The Open Data Platform Initiative (ODPi) released its first ODPi Runtime Specification and test suite Monday as part of its goal of ensuring a standard deployment model for enterprise big data applications across Apache Hadoop distributions. “This is the culmination of this whole year’s work,” says John Mertic, senior manager of ODPi.

The nonprofit ODPi formed last year in an effort to reduce the complexity surrounding the Hadoop and big data environment. The idea was to provide a big data kernel in the form of a tested reference core of Apache Hadoop, Apache Ambari and related Apache source artifacts. The kernel, called ODPi Core, would be used to simplify upstream and downstream qualification efforts: a “test once, use everywhere” core platform that could eliminate the growing fragmentation in the space. Applications and tools built on the reference platform should integrate with and run on any compliant system.

In September of last year, ODPi officially became a collaborative project of the Linux Foundation. Mertic explains that ODPi is an effort to bring together constituents from all the various “party lines” with a stake in the big data ecosystem. “What we really wanted to do was to make sure we could have the community well represented,” he says. “The biggest feedback that we got was that each distro does things slightly differently; they name their files differently; their APIs behave differently.”

The new runtime specification descends from Apache Hadoop 2.7 and covers the HDFS, YARN and MapReduce components. Mertic says the test framework and self-certification align closely with the Apache Software Foundation by leveraging Apache Bigtop for comprehensive packaging, testing and configuration; more than half the code in the latest Bigtop release originated in ODPi. The ODPi Runtime-Compliance tests are linked directly to lines in the ODPi Runtime Specification, and to assist with compliance, ODPi has also provided a reference build. The organization says the published specification includes rules and guidelines on how to incorporate additional, non-breaking features, which are allowed provided the source code is made available through the relevant Apache community processes.
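To make the “write once, run on any compliant distribution” idea concrete, below is a minimal sketch of the kind of application the specification targets: a word count job written against only the stable org.apache.hadoop.mapreduce API from Hadoop 2.7, the surface the Runtime Specification pins down. The class and path names are illustrative, not part of the spec.

    // A minimal sketch of a "write once" Hadoop job. It uses only the stable
    // org.apache.hadoop.mapreduce public API from Hadoop 2.7, so the same jar
    // should run unchanged on any ODPi-compliant distribution. Names here are
    // illustrative, not drawn from the specification itself.
    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class PortableWordCount {

      // Mapper: emit (word, 1) for every token in the input split.
      public static class TokenMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer: sum the counts emitted for each word.
      public static class SumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "portable word count");
        job.setJarByClass(PortableWordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Because the jar depends only on that public API, the same binary should submit unchanged, via hadoop jar, on any distribution that passes the Runtime-Compliance tests.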
“It was a little over a year ago that ODPi was formed, and we have already proved beneficial to upstream ASF projects (Hadoop, Bigtop, Ambari),” says Roman Shaposhnik, director of Open Source at Pivotal and an Apache Hadoop and Bigtop committer and ASF member. “This is why the first release of the ODPi Runtime Specification and test suite is so exciting. It is a big step toward realizing our goal of accelerating the delivery of business outcomes through big data solutions by driving interoperability on an enterprise-ready core platform.”

“Big data is the key to enterprises welcoming the cognitive era, and there’s a need across the board for advancements in the Hadoop ecosystem to ensure companies can get the most out of their deployments in the most efficient ways possible,” Rob Thomas, vice president of product development for IBM Analytics, added in a statement Monday. “With the ODPi Runtime Specification, developers can write their application once and run it across a variety of distributions, ensuring more efficient applications that can generate the insights necessary for business change.”

With the Runtime Specification out the door, Mertic says the next focus will be the ODPi Operations Specification, intended to help enterprises improve the installation and management of Hadoop and Hadoop-based applications. It covers Apache Ambari, which is used for provisioning, managing and monitoring Hadoop clusters; a rough sketch of the sort of Ambari call involved appears at the end of this article. Mertic expects the Operations Specification to be ready this summer.

ODPi is also beginning to weigh what it will focus on after that. Mertic explains that each ODPi member, regardless of size or investment, has exactly one vote. Some possibilities include work around Spark, Kafka, HBase and Hive.
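Since the coming Operations Specification centers on Apache Ambari, here is a small, hypothetical sketch of the kind of Ambari interaction it concerns: querying a stock Ambari server’s REST API for the clusters it manages. The host name, port and admin:admin credentials are Ambari’s out-of-the-box defaults, assumed here purely for illustration; nothing about them is mandated by the ODPi specifications.

    // A hypothetical sketch: list the clusters a stock Apache Ambari server
    // manages via its REST API. Host, port and credentials below are Ambari's
    // defaults, assumed for illustration only.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.Base64;

    public class AmbariClusterList {
      public static void main(String[] args) throws Exception {
        URL url = new URL("http://ambari-host:8080/api/v1/clusters");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Ambari's API uses HTTP basic auth; the X-Requested-By header is
        // required on modifying requests and harmless on reads like this one.
        String auth = Base64.getEncoder()
            .encodeToString("admin:admin".getBytes("UTF-8"));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("X-Requested-By", "odpi-example");
        try (BufferedReader in = new BufferedReader(
            new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
          String line;
          while ((line = in.readLine()) != null) {
            System.out.println(line); // JSON listing of managed clusters
          }
        }
      }
    }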