From healthcare to marketing, analytics, Twitter, social radar and sentiment analysis—and, it seems, just about everything in between—new and old service providers alike are emerging to bring big data to the masses of yearning business users who don’t even know they want, let alone need, big data services yet.
Today, companies are faced with overwhelming IT demands aimed merely at keeping the lights on. Many can ill afford the time or the resources to hire data scientists, assemble massive data sets and deploy Hadoop, R, MapReduce and all the other technologies needed to crunch very large, often non-relational data sets.
The insights waiting to be gleaned from big data can be ignored only at great peril. Organizations of all shapes and sizes therefore find themselves in the midst of yet another technology-led evolution, the vastness and reach of which are only now starting to be appreciated.
Service Providers Help Companies Address Big Data Skills Shortage
Fueled by the many technological advances of the past decade—cloud computing, mobile devices, Wi-Fi, 3G, 4G, LTE and the continued trajectory of Moore’s Law—big data is giving rise to companies that supply raw data sets, cleansed data sets, data scientists, storage and, perhaps most importantly, analytics.
Some companies have been in the business for years. Data as a service (DaaS) providers such as Dun & Bradstreet, LexisNexis or Thomson Reuters have been around for a long time. (They just lacked the buzzword.) For firms such as Opera Solutions, this is outsourcing on steroids. It’s an opportunity not to do what companies already do for themselves but, rather, to offer companies a solution to a problem they are not prepared to take on themselves—at least, not without another big investment in new technologies, new infrastructure and more processing power.
“Companies want and need to get insights from exponentially growing amounts of diverse data, but most lack the skill and computing infrastructure needed,” says Gartner Analyst Rita Sallam. “One way to fill the gap is through service providers.”
According to Dun & Bradstreet CIO Walt Hauck, these service providers will be very good at helping you with first-order business problems, such as whether a marketing campaign is selling more products or what features customers are using in a particular product. However, he says, that’s unlikely to happen with industry-specific questions that involve complex, intensive analysis to find cause-and-effect relationships.
“You think an oil company’s going to send [a service provider] a bunch of seismic data and say, ‘Should we drill here?’ Probably not,” Hauck says.
Aside from the technology limitations, a shortage of data scientists drives this trend. The propeller heads who used to do regression and cohort analysis for the marketing department now find themselves in high demand and short supply. Great for them. Bad for you.
“There is a storm approaching on the big data talent front,” according to the recent Big Data Executive Survey from New Vantage Partners. Writing in the foreword, Harvard Professor Thomas Davenport, co-founder and director of research at the International Institute for Analytics, notes that “70 percent [of respondents] say they plan to hire data scientists, but they already find this ‘challenging’ to ‘extremely difficult,’ and there is no reliable source of new talent in this category. It would seem to be a wise move to begin ‘building’ such talent as well as ‘buying’ it.”
This new breed of data scientist needs to understand not only the new technologies around Big Data (the previously mentioned Hadoop, MapReduce and R, as well as Pig, Hive, VMware Serengeti, NoSQL databases and so on) but also how a business and its vertical market work.
This is a very rare breed at the moment, says Paul Barth, managing partner and founder of New Vantage. “To do these analyses, you still need really smart analysts. That’s kind of a daunting resume.”
Market for Big Data As a Service Small But Growing
According to the EMC report Big Data as a Service: A Market and Technology Perspective, the market for big data as a service (BDaaS) remains small but will grow as more startups get funding, which reports suggest is not too difficult to do right now.
Then there are the big players who are also taking a bite out of this apple.
- EMC is pushing its integrated stack—Greenplum HD, an enterprise-ready Hadoop platform, and Isilon NAS for Hadoop—to Big Data platform providers looking to take on big Hadoop jobs for clients. (Would that be BD/PaaS?)
- Opera Solutions has grown from 10 data scientists in 2004 to 220 today. The company offers firms in the Global 250 and large governmental organizations a semi-turnkey Big Data solution set aimed at analytics and insights.
- LexisNexis Risk Solutions is playing this field with its high-performance computing cluster (HPCC) technology based on the ECL database programming language, which is itself a rival to Hadoop as an approach to crunching big numbers.
- Trend Micro is in the game from a data supply point of view. The company has agreements with the U.S. government’s Computer Emergency Readiness Team (US-CERT) and the Canadian government to supply them with a daily 5 TB feed listing all the malicious activity the organizations see. The company also signed on with Facebook to help it thwart would-be attackers. In essence, Trend Micro, Symantec and McAfee will take feeds from Facebook’s 1 billion users, look for malicious links and alert Facebook to shut them down, says Steve Quane, Trend’s chief product officer. (Companies with a smaller user base can opt instead for the company’s Threat Intelligence Manager services.)
The more you peel, the bigger this onion gets. There’s Google BigQuery and Amazon DynamoDB, a beta version of a NoSQL database. Then there’s Gnip, which aggregates and normalizes social media data streams from the Web for marketing and social media analysis companies such as Alterian, FirstRain and Attensity, which themselves could be considered BDaaS providers. Don’t forget about the cloud storage providers that will be integral to capturing and housing all this data while you figure out how to use it.
Paul Ballew, chief data and analytic officer for Dun & Bradstreet, notes that companies are struggling through intense competition, weak economic conditions and substantial structural changes brought about by increased regulation.
“Against this backdrop, lots of emerging companies are attempting to help firms navigate through the waters,” Ballew said. “Some do this by helping to bring together a complete view of the customers. Others provide analytic engines on existing data environments. Others [consulting firms] still try to do it all. At the end of the day, there will be a great opportunity for firms that can bring it all together in a comprehensive manner.”
Allen Bernard is a Columbus, Ohio-based writer. Follow him on Twitter @allen_bernard1.