Health and Human Services CTO sees big future in big health data as the government looks to expand its repository of publicly available, machine-readable data sets.
By Kenneth Corbin
WASHINGTON — The white-hot controversy surrounding President Obama’s healthcare overhaul — grabbing headlines again amid the latest round of budget fights — has largely overshadowed other areas where the administration is crafting health policy, including how to wring more value from the vast stores of data maintained by the federal government.
Historically, data sets about disease rates, clinical records, Medicare billing and other issues have been kept under lock and key within the sprawling confines of the country’s largest health-data warehouse.
Freeing Data to Public
Many of them still are, to be sure. But administration officials have been warming to the potential to improve patient outcomes and lower costs by releasing more health data to the public, inviting developers, researchers and others to comb through the data sets that the government has compiled, according to Bryan Sivak, the CTO of the Department of Health and Human Services.
“Data is a big part of the future of healthcare in this country,” Sivak said in a keynote address here at a healthcare policy conference hosted last week by the Healthcare Information and Management Systems Society, a not-for-profit group that works to advance health IT.
HHS’s efforts fall within a broader administration initiative to advance open data policies. Most recently, that effort produced an executive order directing agencies to establish open, machine-readable formats as the default for government information.
At HHS, the push toward open data led to the creation of HealthData.gov, a clearinghouse for data sets encompassing everything from vaccination rates to hospital comparisons, all freely available for download.
“We want to make this data as easy to find and use as, say, using OpenTable to make a dinner reservation. So that’s why we created HealthData.gov in the first place, which is really the central catalog that we have in the department for all of the data sets that we can find,” Sivak said.
The portal launched in 2010 with about 30 data sets. The catalog has since swollen to include more than 1,000, and now offers at least one from every operating division within the department.
Making Data Both Open and Useful
Sivak boasted that in that short time, the department has widely embraced the culture of open data, a shift he described as a significant turnaround (if still a work in progress) in an area of the government that oversees stores of highly sensitive information.
“The default setting within HHS has really changed from closed to open,” Sivak said.
But HHS’s work doesn’t end with making the data available. Sivak, a veteran of the software industry, is well aware that open data doesn’t necessarily mean useful data. After all, if the research outputs of the National Institutes of Health or the Centers for Disease Control were all published in PDF format, they would be of scant use to the developer or entrepreneur looking to build an app or even a business on top of that information.
Instead, in the spirit of Obama’s directive on machine-readable data, Sivak and his team are encouraging healthcare workers and researchers both within and outside the federal government to adopt common, developer-friendly formats for the new data sets they create. Additionally, the tech team at HHS behind HealthData.gov is working to convert older data sets into a machine-readable format as they come online and become available for download.
“The problem is that it’s not really useful if it’s not in a usable format, so one of the things that we’re also focused on is what we call data liquidity. What we want to do is we want to increase the connectibility and the use of health data, and I think this is where technology can help us out quite a bit,” he said.
“What we want to do is we want to encourage people to build data in formats such as XML or machine-readable formats. Maybe even better put application programming interfaces on top of those data sets so developers, when they write applications, they don’t even have to suck that data in,” Sivak said.
“All they have to do is make a functional call to a server or to an application that already exists out there and get the data in real time. That way they don’t have to worry about updating it. They don’t have to worry about what format it’s in. It’s just very, very simple to use.”
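To illustrate why a machine-readable format matters to a developer, here is a minimal sketch in Python. The sample XML, its element names and its values are invented for this example and do not come from any HHS data set; the point is only that structured data can be loaded in a few lines, which would not be true of the same figures locked in a PDF.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the element names and values are invented
# for illustration and are not taken from HealthData.gov.
SAMPLE_XML = """
<dataset name="example-vaccination-rates">
  <record><state>NY</state><rate>91.5</rate></record>
  <record><state>CA</state><rate>89.0</rate></record>
</dataset>
"""


def parse_rates(xml_text):
    """Return a {state: rate} mapping from the XML text."""
    root = ET.fromstring(xml_text.strip())
    return {
        record.findtext("state"): float(record.findtext("rate"))
        for record in root.iter("record")
    }


rates = parse_rates(SAMPLE_XML)
print(rates)
```

An API of the kind Sivak describes would go one step further: the application would request records like these from a server on demand instead of storing and refreshing its own copy.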
Going Beyond Feds for Data
In its bid to expand HealthData.gov into a centralized hub of health data, HHS is hoping to attract data sets from outside the federal government. Already, the site includes 61 data sets from New York State, but Sivak sees that only as a start.
Last Thursday, he put out an open call for health data from other state agencies and members of the private sector, noting that the submission of a data set to the HHS catalog only expands its visibility, and does not entail any transfer of ownership.
“We don’t ingest them. We just basically create a pointer to them, right, so people will always get pointed back that original location of where the data is,” Sivak said. “But think of that — if we could have a massive catalog of all of the health data that’s available out there in one place, that’s a great resource for anybody who’s trying to do anything with this ecosystem.”
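The pointer model Sivak describes can be sketched roughly as follows. This is an illustration of the idea only, not HealthData.gov's actual implementation; the class names, fields and URLs are hypothetical. Each catalog entry stores descriptive metadata and a link to the publisher's own copy, so the data itself never moves and ownership stays with the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata about a data set; the data stays with its publisher."""
    title: str
    publisher: str
    source_url: str  # pointer back to the original location


class Catalog:
    """A hypothetical catalog that indexes data sets without ingesting them."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def search(self, keyword):
        """Return entries whose title contains the keyword (case-insensitive)."""
        needle = keyword.lower()
        return [e for e in self._entries if needle in e.title.lower()]


catalog = Catalog()
# Placeholder entries; the titles and URLs are made up for this sketch.
catalog.register(CatalogEntry(
    "Hospital comparison data", "Example State Agency",
    "https://example.org/data/hospitals"))
catalog.register(CatalogEntry(
    "Vaccination rates", "Example Health Department",
    "https://example.org/data/vaccinations"))

for entry in catalog.search("hospital"):
    print(entry.title, "->", entry.source_url)
```

Because a search returns only the link, anyone who follows it lands on the publisher's site and always sees the publisher's current version of the data.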
Kenneth Corbin is a Washington, D.C.-based writer who covers government and regulatory issues for CIO.com.