10 Hot Hadoop Startups to Watch

As data volumes grow, figuring out how to unlock their value becomes increasingly important. Hadoop enables the processing of large data sets in a distributed environment and has become almost synonymous with big data. Here are 10 startups with solutions for unlocking big data value.


Competitive Landscape: DataTorrent's main competitors come from IBM (Infosphere Streams) and the Storm Open Source Project.

Key Differentiator: DataTorrent points to performance as a key differentiator, claiming its platform is 100 to 1,000 times faster than Storm.

7. Qubole


What They Do: Offer Big Data-as-a-Service with a "true auto-scaling Hadoop cluster."

Headquarters: Mountain View, Calif.

CEO: Ashish Thusoo, who ran Facebook's data infrastructure team before co-founding Qubole. He also co-founded Apache Hive.

Founded: 2011

Funding: The company is backed by $7 million in Series A funding from Lightspeed Ventures and Charles River Ventures.

Why They're on This List: Since Hadoop is a relatively new technology, finding someone with the expertise necessary to run and maintain it can be a tall order. By providing a managed solution, Qubole hopes to make Hadoop an easy-to-use technology.

Qubole handles the initial setup and then maintains the clusters. Its auto-scaling feature automatically spins up a user's cluster when a job starts, then expands or contracts the cluster based on workload, cutting costs and management requirements.

An intuitive UI expands the reach of this service beyond data analysts to entire lines of business. Qubole contends that at some customers, more than 60 percent of employees use the service.

Customers include Pinterest, MediaMath, Nextdoor and Saavn.

Competitive Landscape: Qubole will compete with Altiscale, Amazon EMR, Treasure Data, and others.

Key Differentiator: Qubole points to its proprietary technology that provides true auto-scaling and storage optimization.

8. Continuuity


What They Do: Provide a Hadoop-based big data application hosting platform.

Headquarters: Palo Alto, Calif.

CEO: Jonathan Gray, who was previously an HBase software engineer at Facebook.

Founded: 2011

Funding: $12.5 million from Battery Ventures, Ignition Partners, Andreessen Horowitz, Data Collective and Amplify Partners.

Why They're on This List: Continuuity has come up with a clever way to get around the dearth of Hadoop experts: they offer an application developer platform targeted at Java developers. The lower-level infrastructure is all abstracted away by the Continuuity platform.

The company's flagship product, Reactor, is a Java-based integrated data and application framework that layers on top of Apache Hadoop, HBase, and other Hadoop ecosystem components. It surfaces capabilities of the infrastructure through simple Java and REST APIs, shielding end users from unnecessary complexity.

In late March, Continuuity released its latest service, Loom, a cluster management solution. Clusters created with Continuuity Loom are built from templates of any hardware and software stack, from simple standalone LAMP-stack servers and traditional application servers like JBoss to full Apache Hadoop clusters comprising thousands of nodes. Clusters can be deployed across many cloud providers (Rackspace, Joyent, OpenStack) using common SCM tools (Chef and scripts).

One thing to keep an eye on is the CEO situation. Founding CEO Todd Papaioannou, who was previously vice president and chief cloud architect at Yahoo, left the company this past summer. Co-founder and previous CTO Jonathan Gray has taken over the CEO role. This is Gray's first role as a business leader.

Competitive Landscape: As of now, Continuuity is uniquely positioned. Indirect competitors come from the HaaS camp (AWS EMR, Altiscale, Infochimps, Mortar Data, etc.).

Key Differentiator: Continuuity is targeted at Java developers, which is a unique approach.

9. Xplenty


What They Do: Provide HaaS.

Headquarters: Tel Aviv, Israel

CEO: Yaniv Mor, who previously managed the NSW SQL Services practice at Red Rock Consulting.

Founded: 2012

Funding: An undisclosed amount of seed funding from Magma Venture Capital.

Why They're on This List: Hadoop may be heavily hyped these days, but it has become the de facto infrastructure technology for big data. The trouble is that the development, implementation, and maintenance of Hadoop require a very specialized skill set.

Xplenty's technology provides Hadoop processing in the cloud via a coding-free design environment, so businesses can quickly and easily benefit from the opportunities offered by big data without having to invest in hardware, software, or highly specialized personnel.

A drag-and-drop interface eliminates the need to write complex scripts or code of any kind. With its automatic server configuration feature, users can simply point to a data source, configure the data transformation tasks, and tell the platform where to write the results. Because Xplenty's platform uses SQL terminology, the learning curve for data analysts should be minimal.

Customers include DealPly Technologies, Fiverr, Iron Source, and WalkMe.

Competitive Landscape: The main competition comes from Amazon's EMR. Other HaaS competitors include Altiscale, Mortar Data, Qubole, and, more recently, Microsoft with Hadoop on Azure. Rackspace is about to launch its own HaaS offering based on Hortonworks' distribution.

Key Differentiator: According to Xplenty, competing services still target developers, whereas Xplenty targets the data and Business Intelligence (BI) users who do not know how to write code, but who need to move data to a big data platform.

10. Nuevora


What They Do: Provide Big Data analytics applications.

Headquarters: San Ramon, Calif.

CEO: Phani Nagarjuna, who most recently served as executive vice president of products and business development for OneCommand, which provides a SaaS-based CRM and Loyalty Automation Platform for the auto retail industry.

Founded: 2011

Funding: $3 million in early funding from Fortisure Ventures.

Why They're on This List: Nuevora has set its sights on one of big data's early growth areas: marketing and customer engagement. Nuevora's nBAAP (Big Data Analytics & Apps) Platform features purpose-built analytics apps based on best-practices-driven predictive algorithms. nBAAP is based on three key big data technologies: Hadoop (data processing), R (predictive analytics), and Tableau (visualizations).

On top of all of this, Nuevora's algorithms work on disparate sources of data (transactional, social media, mobile, campaigns) to quickly identify patterns and predictors in order to tie specific goals to individual marketing tactics.

The platform includes pre-built apps for the customer marketing business process -- acquisition, retention, up-sell, cross-sell, profitability, and customer lifetime value (LTV). With only "last-mile" configurations required for individual customer situations, Nuevora's apps empower organizations to anticipate their customers' behaviors.

Competitive Landscape: When Nuevora assesses the competitive landscape, it zeroes in on big consulting firms, such as Accenture, and other predictive analytics companies, such as Alpine Data Labs.

However, since pretty much every marketing platform under the sun now includes some sort of analytics engine, I also expect them to compete with the major marketing automation providers, such as ExactTarget (which uses Pentaho for its big data analytics).

Key Differentiator: Nuevora gives end users the ability to continually recalibrate their predictions using a "closed-loop recalibration engine," which helps organizations keep up with only the most pertinent insights based on the latest data.

Jeff Vance is a freelance writer based in Santa Monica, Calif. Connect with him on Twitter @JWVance or by email at jeff@sandstormmedia.net.
