WASHINGTON -- As agencies in the federal government consider new IT initiatives to handle the challenges presented by soaring volumes of structured and unstructured data, they are playing catch-up with their industry counterparts. But a big data strategy is essential to ensure that federal IT departments glean valuable information from far-flung sources and, most importantly, effectively execute their mission to serve constituents, according to Sameer Kalbag, CTO of HP Autonomy's federal division.
In remarks to federal workers and contractors at a government IT conference, Kalbag described the twin pressures that government agencies are facing from contracting budgets and surging data--not just growth in volume, but an influx of content in new forms and from new sources, such as social media, sensor technology and audio and video--amounting to what he called an "untenable situation."
"It's not a surprise to anyone that data is growing at exponential rates," Kalbag says. "What's different today is really the variation and the types of data out there."
Data Is Not Just Bigger
Part of that shift stems from what might be described as the democratization of content production. Kalbag points out that not so many years ago, even when businesses were dealing with enormous data sets, that information came from relatively few sources.
Fast-forward to a time when blogging already seems a quaint technology and the platforms for expression online are innumerable, and the challenges of gleaning useful insights from vast stores of disparate data are far greater than in the early days of the Web.
"The way content is being created is fundamentally different," Kalbag says. "Extracting value out of a video is a lot different than looking at rows and columns in a database."
And just by the numbers, federal IT shops have their hands full. Earlier this year, MeriTalk, a research group and networking community for government IT workers, surveyed departments and agencies and found that 87 percent of federal IT workers reported that the volume of data they deal with had increased over the past two years. Meanwhile, 96 percent of respondents said they expect their data volumes to rise over the next two years, by an average of 64 percent. Nearly one-third of federal data was described as unstructured.
The survey also reveals that federal IT workers are still grappling with their approach to big data, with nine in 10 reporting challenges with their implementations. On average, survey respondents say that they don't expect to take full advantage of big data solutions for another three years.
That projection is underscored by the relatively underdeveloped programs for data analysis currently underway in the government. In MeriTalk's survey, published in May, 60 percent of IT workers said that their agency analyzes the data it collects, while just 40 percent reported that they are using that data to inform strategic decisions.
While the momentum for big data initiatives is growing in the government, Kalbag says that federal IT workers generally do not operate under the same competitive pressure as their counterparts in the private sector where, according to HP, the difference between winners and losers is often a matter of which company makes the best use of the data at its disposal.
"If you look at the commercial sector, the ones that are actually differentiating themselves and moving ahead and basically winning the marketplace, are the ones who are spending a lot of their energies in doing mathematically based analytics that's defensible in terms of understanding how their business works, what factors in the real world actually affect who buys what, and how their business operates, and then optimizing their businesses based on that information," he says.
"There's a lot more impetus for big data in the commercial space than we have seen in the federal government, because for them it's a matter of survival. Either they're going to do this and move forward, or they're going to be left behind," Kalbag says.
Feds Get Big Data
In the federal government, big data has won the backing of the upper reaches of the administration. In March, the White House Office of Science and Technology Policy teamed with six agencies and departments, including the National Science Foundation and Department of Defense, to launch the Big Data Research and Development Initiative, a program to spur the development and application of big data technologies throughout the government.
HP, like many other tech heavyweights, is vying for government contracts to assist departments and agencies with their big data projects. Kalbag pitched HP's technology as a federated solution that tackles data at the system level, integrating software, hardware, networking and storage, and emphasized its compatibility with the popular Hadoop offering.
Traditionally, data analytics were the province of business intelligence applications, Kalbag says. And the approach of culling information from disparate systems and compiling it in a data warehouse was more or less unchanged even as IT progressed from mainframes to the client-server model and then onto the Web.
However, typical BI systems, including those in use across the government, lack capabilities like sentiment inference and artificial intelligence that can mine unstructured data sources to produce a more holistic data set suited to forward-looking, predictive analytics.
"The fundamental issue with that approach is you're always looking backwards," Kalbag says of BI applications. "Second, it often tells you what happened but it doesn't tell you why it happened."
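To make the idea of sentiment inference concrete, the toy sketch below scores free-text comments with a simple keyword lexicon. This is an illustrative example only, written for this article; it is not how HP Autonomy's products or any government system actually works, and the word lists and comments are invented for demonstration. Production tools rely on far richer statistical and machine-learning models.

```python
import re

# Tiny hand-picked lexicons -- purely illustrative, not a real sentiment model.
POSITIVE = {"helpful", "fast", "resolved", "excellent", "easy"}
NEGATIVE = {"delay", "broken", "confusing", "slow", "unresolved"}

def sentiment_score(text: str) -> float:
    """Return a score from -1.0 (negative) to 1.0 (positive); 0.0 if neutral."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

# Hypothetical constituent comments, scored one by one.
comments = [
    "The new portal is fast and easy to use.",
    "My claim is still unresolved after a long delay.",
]
for c in comments:
    print(f"{sentiment_score(c):+.2f}  {c}")
```

Even this crude approach hints at the "why" behind the numbers: aggregating such scores over thousands of comments can flag which programs are drawing complaints, something a backward-looking report of transaction counts alone cannot reveal.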
There are "lots of opportunities for agencies in government to take advantage of big data," he says. "We think the commercial market is a little ahead of government in the big data space, but there are lessons to be learned."
Kenneth Corbin is a Washington, D.C.-based writer who covers government and regulatory issues for CIO.com.