It's a bit of commonly accepted wisdom in big data circles that small companies and startups will be the ones to drive big data technology forward and define the shape of the market to come. The large players, like IBM and Oracle, unable to adjust quickly enough to a world changing at breakneck speed, will see bits of their business intelligence market stripped away by smaller, more agile competitors.
The stakes are high: Transparency Market Research has forecast that the global market for big data will grow to $48.3 billion by 2018.
But IBM isn't taking the common wisdom lying down. Last week, at a splashy New York City event, it announced the new IBM Watson Group business unit, a signal not just of its intention to step up competition in the big data market but of its plan to leapfrog into a leadership position with what it has dubbed a "new era of computing."
In 2011, IBM introduced the world to Watson, a supercomputer that could play the TV game show Jeopardy! and win against human opponents. Not just any opponents either: Watson made mincemeat out of Jeopardy! champions Brad Rutter and Ken Jennings. Rutter holds the record for Jeopardy! winnings while Jennings holds the record for longest Jeopardy! winning streak with 74 straight wins.
"In 2011, we introduced a new era [of computing] to you. It is cognitive," says IBM CEO Ginni Rometty. "It was a new species, if I could call it that. It is taught, not programmed. It gets smarter over time. It makes better judgments over time. Why did we take Watson on? It's built for a world of big data. It has the potential to transform businesses and industries everywhere."
"It is not a super search engine," she adds. "It can find a needle in a haystack, but it also understands the haystack."
What Is Cognitive Computing?
In other words, IBM says cognitive computing systems like Watson are capable of understanding the subtleties, idiosyncrasies, idioms and nuance of human language by mimicking how humans reason and process information.
Whereas traditional computing systems are programmed to calculate rapidly and perform deterministic tasks, IBM says cognitive systems analyze information and draw insights from that analysis using probabilistic analytics. In effect, they continuously reprogram themselves based on what they learn from their interactions with data.
"If you take it at its essence, at its core, it's a system that understands natural language," says Michael Rhodin, formerly senior vice president of IBM's Software Solutions Group, who has been tapped to lead the Watson Group. "It reads. When it reads a lot, it adapts and learns. It gets smarter. When you ask it questions, it will generate hypotheses—potential answers—with a degree of confidence."
"It doesn't just learn from what it knows today," Rhodin adds. "You can add new data to it. It reads new books every day. It connects the dots from what it just read to what it has already read. Sometimes what it just read contradicts what it's already read. It has to sort that out. As we start to move forward, Watson's getting smarter. We're adding new capabilities to it. It's learning to reason, to think through things."
Cognitive computing really comes down to three abilities, according to IBM:
- The ability to perform deep natural language processing and analysis both for information ingestion and research, as well as to provide human-style communication.
- The ability to statistically generate and evaluate series of evidence-based hypotheses to be able to answer questions in a relevant and meaningful manner.
- The ability to adapt and learn from training, interaction with humans and outcomes related to hypotheses it generates.
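As a rough illustration of the second and third abilities, the toy Python sketch below scores candidate answers against pieces of evidence, converts the scores into confidence values, and adjusts its evidence weights after feedback. It is not IBM's implementation: the function names (`rank_hypotheses`, `apply_feedback`), the evidence sources, the scores and the weighting scheme are all hypothetical and chosen only to make the idea concrete.

```python
import math


def rank_hypotheses(evidence, weights):
    """Return (answer, confidence) pairs sorted by confidence, highest first.

    evidence: {answer: {source_name: score in [0, 1]}}  (made-up inputs)
    weights:  {source_name: weight}
    """
    raw = {
        answer: sum(weights.get(src, 0.0) * score for src, score in scores.items())
        for answer, scores in evidence.items()
    }
    # Softmax turns the raw scores into confidences that sum to 1.
    total = sum(math.exp(value) for value in raw.values())
    ranked = [(answer, math.exp(value) / total) for answer, value in raw.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def apply_feedback(weights, evidence, correct_answer, rate=0.5):
    """Raise the weight of each source in proportion to its support for the correct answer."""
    updated = dict(weights)
    for src, score in evidence[correct_answer].items():
        updated[src] = updated.get(src, 0.0) + rate * score
    return updated


# Hypothetical evidence scores for two candidate answers to one question.
evidence = {
    "Toronto": {"keyword_match": 0.9, "category_fit": 0.1},
    "Chicago": {"keyword_match": 0.6, "category_fit": 0.9},
}
weights = {"keyword_match": 1.0, "category_fit": 1.0}

before = rank_hypotheses(evidence, weights)
# A human confirms that "Chicago" was correct, so the weights are adjusted.
weights = apply_feedback(weights, evidence, "Chicago")
after = rank_hypotheses(evidence, weights)

for label, ranking in (("before feedback", before), ("after feedback", after)):
    print(label, [(answer, round(conf, 3)) for answer, conf in ranking])
```

In this sketch the top-ranked answer carries a confidence rather than a yes-or-no verdict, and the feedback step makes the system more confident in the confirmed answer the next time it sees similar evidence.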
"These abilities make Watson ideal for applications where large amounts of information need to be ingested and understood, complex decisions are made and feedback is available to train the application to improve its decision making over time," says David Schubmehl, research director for Search, Content Analytics and Discovery at research firm IDC.
"Today, consumers continue to struggle to find and use actionable information," Schubmehl adds. "Traditional search systems deliver web pages, documents, video and audio to users when what they are really looking for is answers and advice."
"The technology behind IBM's Watson takes all of these information sources and distills them down to the important facts, events and relationships. Its natural language capabilities, hypothesis generation, cognitive analytics and machine-learning components then utilize these facts, events and relationships to answer questions in the same manner that a human would. In addition, if it is wrong or incorrect, Watson has feedback facilities built into it so that it can learn and get 'smarter' over time," Schubmehl says.
IBM Struggling to Gain a Big Data Foothold With Watson
And yet, for all that, IBM has struggled to gain traction for Watson in the commercial sphere since the high-profile Jeopardy! win, and the market's response has been tepid. According to The Wall Street Journal, Watson has brought IBM only $100 million in revenue in the three years since its debut.
Part of that may be attributed to what ran under Watson's hood. The Watson that won Jeopardy! was built on 90 IBM Power 750 Express servers powered by 8-core processors, four in each machine, for a total of 32 cores per machine. At the time, Power 750 servers were running $34,500 apiece, adding up to about $3.1 million for the 90 machines.
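The hardware figures above can be checked with a few lines of arithmetic. The short Python sketch below uses only the numbers quoted in this article; it assumes the per-server price is the whole hardware cost, ignoring storage, networking and other expenses, and the variable names are illustrative.

```python
# Figures quoted in the article for the Jeopardy!-era Watson cluster.
SERVERS = 90
PROCESSORS_PER_SERVER = 4
CORES_PER_PROCESSOR = 8
PRICE_PER_SERVER_USD = 34_500  # assumed to be the full per-machine cost

cores_per_server = PROCESSORS_PER_SERVER * CORES_PER_PROCESSOR
total_cores = SERVERS * cores_per_server
hardware_cost_usd = SERVERS * PRICE_PER_SERVER_USD

print(f"Cores per server: {cores_per_server}")
print(f"Total cores: {total_cores}")
print(f"Hardware cost: ${hardware_cost_usd:,}")
```

Four 8-core processors give 32 cores per machine, 2,880 cores across the cluster, and a server bill of $3,105,000, which is the "about $3 million" figure.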
While IBM characterized the price as affordable, particularly in the healthcare vertical Big Blue was targeting, the alternative was a Hadoop cluster built with open source software and commodity servers. Scaling with commodity hardware is a very different proposition from scaling with big-ticket servers.
"There are several reasons for such stunted growth, but the biggest may simply be that IBM is a luxury in a world of commoditized, open-source big data analytics," says Matt Assay, vice president of business development and corporate strategy at MongoDB. "Why pay millions for Watson when you can run Hadoop for free?"
In addition, Assay notes that the WSJ found that the way Watson learns means that IBM's engineers have to learn the technicalities of a customer's business and translate them for Watson.
"In other words, Watson is like hiring an expensive data scientist, except not nearly as thoughtful," Assay says. "Far better for the customers in question to learn Hadoop or other big data technologies and ask questions of the data themselves than to pay both for IBM's expensive consultants and its big data technology, which happens to be Hadoop under the covers, anyway."
Finally, Assay asks, where are the developers? He calls Watson's lack of a developer ecosystem its "most glaring omission."
Big Blue Seeking to Answer Doubts with New IBM Watson Group
With the new Watson Group, IBM is aiming to turn all that around. The group, which IBM will scale up to 2,000 employees, is set to develop and run cloud-based cognitive applications and services on behalf of enterprise users.
IDC's Schubmehl says the current Watson cloud service can support millions of users, supports dialog chaining for input, can ingest and train in hours and supports a broad industry corpus of knowledge.
Big Blue has earmarked a $1 billion investment for the group, including $100 million for investing in startup companies building applications that will run on its new Watson Developer Cloud, a cloud-hosted marketplace for application providers developing Watson-powered apps. The Developer Cloud will support the Watson Ecosystem, a cloud-based implementation of Watson that partners and third-party developers can use to embed cognitive capabilities into new or existing applications.
"We're investing $1 billion in this over the next few years," Rhodin says. "We're going to share Watson with the world. Eras are not ours alone. We just happen to have a history of shepherding them and bringing them to the entire world. We make markets. And that's what we're going to do."
"We recognize that the power of this technology is really what it can do for everyone," he adds. "To get to everyone, we need help. We need an ecosystem. We need partners. We think everyone that decides to join us is going to change the world."
A number of early IBM Watson Ecosystem participants have already begun showing off early versions of Watson-powered apps slated for release this year, including the following:
- MD Buyline, a provider of supply chain solutions for hospitals and healthcare systems, which is developing an app to allow clinical and financial users to make real-time, informed decisions about medical device purchases.
- Welltok, a specialist in social health management, which is developing a mobile and web-based consumer app that will help users create "Intelligent Health Itineraries," which reward users for engaging in healthy behaviors.
- Fluid, a startup that builds online shopping experiences for major brands like Reebok and Brooks Brothers, which is building a personal shopper app.
"The Watson Ecosystem is going to facilitate a wide range of these types of expert recommendation applications over the next several years and is going to continue to fuel research and innovation in using the tremendous growth in digital information to improve a wide range of processes from medical diagnosis to shopping to scientific research," says IDC's Schubmehl.
Watson Big Data Services
Out of the gate, IBM is offering three new services based on Watson's cognitive intelligence: Watson Discovery Advisor, Watson Analytics and Watson Explorer.
Watson Discovery Advisor is intended to help users find the right questions in their data. IBM says it will revolutionize how industries like pharmaceuticals and publishing conduct their research by reducing the time researchers need to formulate conclusions that can advance their work. For instance, with Discovery Advisor, Watson can pore through millions of articles, journals and studies, determine context, synthesize the data and help users pinpoint connections.
IBM has been working closely with Elsevier, a leading provider of scientific, technical and medical information products and services, including texts like Gray's Anatomy and journals like The Lancet, to explore how Watson's cognitive technologies can be used to help clinicians stay up-to-date on the medical knowledge necessary to give their patients the best possible care—an increasingly difficult task as medical data continues to grow at an exponential rate.
Watson Analytics allows users to explore big data insights through visual representations, without the need for advanced analytics training. IBM says the service removes common impediments in the data discovery process, giving business users the ability to quickly and independently uncover insights in their data. Watson Analytics prepares the data, surfaces the most important relationships and presents the results in an interactive visual format.
Finally, Watson Explorer is intended to help users across the enterprise uncover and share data-driven insights through a unified view that displays all of a user's data-driven information, as well as a framework for developing information-rich applications that deliver a comprehensive, contextually relevant view of topics.
In the second half of this year, IBM plans to roll out Watson Engagement Advisor, intended to help businesses redefine their engagement with customers. DBS Bank announced last week that it will apply Watson, including Engagement Advisor, to its wealth management business to improve the advice and experience it delivers to affluent customers.
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.