How Big Data Will Save Your Life

Dr. Robert Walker, director of health innovation for the U.S. Army Surgeon General, has been more a frustrated data entry clerk in recent years than a physician, a frustration shared by thousands of his colleagues.

Instead of freeing him for more face-time with patients, the electronic health record (EHR) system he uses has become a third person in the exam room, drawing his attention away from patients. The issue isn't the EHR Walker uses, however; it's the shortcomings of technology in general.

"The electronic medical record has become an impediment versus something that was going to streamline your day," Walker explained in a recent interview. "It took the focus away from the patient and put it all on the computer. People are clicking boxes and turning their backs to patients. It's all about jamming data into this thing."

EHRs make it possible for every medical care facility to electronically capture a patient's family history, illnesses, treatments and current lifestyle. The promise of EHRs was that they would save the U.S. healthcare system up to $81 billion a year by streamlining workflows and creating massive clinical data warehouses that could be mined for information to improve preventative care and disease treatment.

That has not yet happened, and doctors are less enamored with EHRs as a result. Last month, the American College of Physicians and AmericanEHR Partners released a survey of 4,279 physicians that showed fully 39% of them would not recommend their EHR to a colleague. That's up from 24% who felt that way in 2010. And 34% said they are "very dissatisfied" with the ability of EHRs to decrease workload.

Under the auspices of the Health Information Technology for Economic and Clinical Health Act (HITECH Act), the U.S. government is requiring healthcare providers -- hospitals, clinics and private practices alike -- to implement EHRs. Providers must also prove their meaningful use of those systems through a three-stage government process that is taking place over the next four years.

Despite what has so far been an uneven rollout of EHRs in the U.S., Walker and others are already, in effect, building a treasure trove of patient information that can be tapped to improve patient care, a repository that will revolutionize medicine for decades to come. That is, if everyone can figure out how to categorize it, sort it and access it easily.

The promise

Big data analytics engines such as Hadoop have the capability to mine the clinical data warehouses created by EHRs, warehouses filled with valuable unstructured data that can be used to help doctors make decisions about patient treatment.

Today, physicians and pharmaceutical companies still rely largely on textbooks and very small clinical studies that typically use otherwise healthy patients with only one disease. That pool of subjects hardly mimics most real-world patients, many of whom have more than one health problem.

About 25% of hospitals use some form of data analytics to mine traditional databases to learn more about past treatments and about how future treatments can be improved. But what is contained in the columns and rows of databases represents an almost insignificant portion of the information about patients that's been collected; the most important information lies in unstructured data -- the physicians' notes, radiological images and lifestyle information gathered from patients using mobile devices.

"That's the real renaissance that's going to happen in health care," Walker said. "With big data, what happens in a doctor's office is going to be vastly different from what we see today. The top five or 10 things that people die from in America are life-style induced. That's absurd. Maybe instead of vital signs, I'm just going to look at what you buy in a grocery store."

Today, data analytics in most hospitals is used to manage costs and increase the quality of care. The more promising use for big data, however, is the ability to discover treatment-and-outcome correlations using physician and nurse notes and data driven by genetic profiles.

By combining big data and genetics analytics, scientists today can determine how a patient will react to a medication and may someday even be able to predict who may become ill and -- if they do -- what customized medications can best treat diseases.

"When I look at the historical growth rate, [big data] is definitely a hot application in the marketplace," said James Gaston, senior director of clinical and business intelligence at the Healthcare Information and Management Systems Society (HIMSS).

Personalized medicine

Currently, one of the more promising areas of big data analytics involves drug therapies devised through the study of genomics, also known as personalized medicine.

Genetic diseases are akin to buggy code in software; the key to finding the cause of an illness is to uncover that error in the code, according to Alexis Borisy, co-founder of Foundation Medicine, a cancer diagnostics company.

"Cancer, for example, is a disease of the genome where something has gone wrong with the programming code and a mutation occurred. There are actual errors in the code and that's a core reason why cancer develops," Borisy said.

While sequencing the first human genome took eight years and cost about $1 billion, genetic sequencing costs have fallen dramatically in the last decade. It now costs from $5,000 to $10,000 per human genome, and companies are working hard to cut that cost to $1,000 in the next few years. Sequencing a DNA strand is becoming so inexpensive that hospitals will soon be able to do it for most patients and add the data to an EHR, according to Nigam Shah, an assistant professor of medicine at Stanford University's School of Medicine.

Shah works in biomedical informatics, meaning he works toward making sense of the information in clinical data warehouses.

Sequencing of a human genome yields a massive amount of data, and storing one person's genetic code can require up to 1TB of data storage capacity, Shah said.

The human genome contains 3.2 billion lines of code, which means that finding a flaw in that code requires sophisticated computer algorithms and massive, clustered server farms. Adding to the complexity is that disease is often the result of multiple mutations, according to Shah.

While diseases such as Huntington's or Alzheimer's disease are caused by common genetic mutations, and are more easily spotted, most illnesses are caused by rare mutations. Diabetes, for example, is thought to be caused by a number of genetic mutations, which on their own confer a small amount of risk, but in combination can be more serious.

"If you genome type someone, and out of the 50 [mutations associated with diabetes] you have 10 of them, it's very hard to say what's going to happen to you," Shah said. "Part of the problem is that we just need to do more research and collect more data, and some of it we just need better methods."
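
Shah's 10-out-of-50 example can be made concrete with a small sketch. It assumes the simplest possible model, in which each mutation contributes a small fixed weight and the weights are added together. The variant names and weights are invented for illustration and are not real diabetes markers, and nothing here reflects the methods Shah or his colleagues actually use.

```python
from typing import Dict, Set


def additive_risk_score(carried: Set[str], weights: Dict[str, float]) -> float:
    """Sum the per-variant weights for every variant the patient carries."""
    return sum(w for variant, w in weights.items() if variant in carried)


# Hypothetical data: 50 made-up variants, each given the same small weight.
weights = {f"variant_{i}": 0.02 for i in range(50)}

# A patient who carries 10 of the 50 variants.
carried = {f"variant_{i}" for i in range(10)}

score = additive_risk_score(carried, weights)
max_score = sum(weights.values())
print(f"carries {len(carried)} of {len(weights)} variants; "
      f"score {score:.2f} of a possible {max_score:.2f}")
```

The number that comes out is easy to compute but hard to interpret, which is Shah's point: without more research on how the mutations interact, a partial score says little about what will happen to a given patient.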

But tremendous progress has been made. Scientists now know the genetic causes of about 5,000 rare diseases.

One of the most promising areas of genetic research is pharmacogenomics, which uses a person's genetic makeup to determine how they'll respond to drugs, tailoring treatments to specific mutations -- even mutations found in cancer tumors.

For example, the drug Zelboraf was developed by New York University's Cancer Institute a couple of years ago through genetic tests to target melanoma skin cancer tumors that express a gene mutation called BRAF V600E. Researchers found patients taking Zelboraf were 64% less likely to die from the advanced form of skin cancer than patients who received only standard chemotherapy.

"Looking at your genome does help in saying, 'For you, we should give half the dose of this drug, but for this other person we'll give you a double dose of that drug,'" Shah said.

Linking EHRs with genomes

Currently, there are several projects underway to link EHRs and human genomic data. Among the most promising is the Electronic Medical Records and Genomics (eMERGE) Network.

Funded by the National Human Genome Research Institute, the eMERGE network joins researchers from nine healthcare research organizations and hospitals with a wide range of expertise in genomics, statistics, ethics, informatics and clinical medicine. Up to 10,000 patients will have sequencing performed on them in reference to 83 specific genes, with another 50,000 to 80,000 patients getting more general genotypes.

The resulting data will improve genetic risk assessment, disease prevention, diagnosis and treatment, and can be used to develop genomic-based medicines, according to Dr. Gail Jarvik, head of the division of Medical Genetics at the University of Washington.

The eMERGE network includes the University of Washington, the Mayo Clinic, Boston Children's Hospital and the Geisinger Health System. The network started out looking for genes for more common diseases, using computer algorithms with EHRs to find the diseases associated with a particular genotype.

"This year, the network moved into pharmacogenetics, and it is very interested in sequencing of genes related to treatment response or adverse response to medications," Jarvik said.

Jarvik, one of the network's principal investigators, said the network has been successful in finding genes linked to disease and immunity, including genes associated with eye and cardiac disorders.

The eMERGE project has developed a computer algorithm that extracts disease types from a number of different EHRs at various institutions. Researchers then input the data and look for genetic markers that point to mutations responsible for diseases.

"When you move to pharmacogenetics, there are problems you can have with drugs," Jarvik said. "A drug can be ineffective, or you may have an effective use of that drug but you may need a different dose than someone else. Or you might have a bad reaction. We want to work on all those problems."

Shah and other researchers caution that many variables affect a person's health, and genomics won't be a cure-all. But the use of big data analyses can help improve patient outcomes.

Notes, images and biometrics

Genomics is only "one tiny fraction" of the myriad efforts to improve healthcare, Shah said. "For the average Joe who has hypertension, diabetes [and] high cholesterol, genomics is completely useless."

One of the most valuable tools in diagnosing and tracking patients still involves medical notes, and new natural language processing software is allowing physicians' notes to be codified into the database fields that most healthcare professionals don't have time to fill out themselves.
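
As a rough illustration of what codifying a note into database fields means, here is a toy sketch that uses plain keyword matching. Real clinical natural language processing is far more sophisticated; the field names, phrase lists and sample note below are all invented for the example and do not come from any product mentioned in this article.

```python
import re
from typing import Dict

# Hypothetical mapping from a coded database field to phrases that suggest it.
FIELD_PATTERNS = {
    "hypertension": re.compile(
        r"\b(hypertension|high blood pressure)\b", re.IGNORECASE),
    "diabetes": re.compile(
        r"\b(diabetes|diabetic)\b", re.IGNORECASE),
    "high_cholesterol": re.compile(
        r"\b(high cholesterol|hyperlipidemia)\b", re.IGNORECASE),
}


def codify_note(note: str) -> Dict[str, bool]:
    """Return one True/False field per condition mentioned in a free-text note."""
    return {
        field: bool(pattern.search(note))
        for field, pattern in FIELD_PATTERNS.items()
    }


# Invented sample note, written the way a hurried clinician might type it.
note = "Pt reports high blood pressure for 5 years; diabetic, on metformin."
fields = codify_note(note)
print(fields)
```

Keyword matching like this cannot tell "diabetic" from "not diabetic" or from a family member's history, which is one reason production systems rely on fuller language processing rather than simple pattern lists.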

"Textual notes are how doctors communicate with other healthcare providers about what's going on with a patient, what's the plan for treatment and what are the concerns," said Dr. Isaac S. Kohane, a professor of pediatrics and health sciences technology at Harvard Medical School & Children's Hospital.

Kohane is frustrated that it's easier to find out more about shoppers' experiences with a digital camera purchase than to determine what adverse events patients had with a particular drug. So, along with several colleagues, Kohane developed free open source software called i2b2 informatics that can collect both physician notes and other unstructured data as well as codified medical data from a patient's bedside.

The informatics platform is used by more than 100 academic health centers around the world. It has been used to pinpoint genetic predictors for diseases such as rheumatoid arthritis and to identify harmful drugs.

For example, the informatics engine revealed that there was a higher risk of heart attack from the drug Avandia than from other drugs in the same class.

When the i2b2 software was deployed in hospital emergency rooms, it was able to predict, on average, two years in advance of the typical healthcare system whether a patient was suffering from domestic abuse by detecting physical traits, Kohane said.

"At the same time, this is almost like a back door. The data is being offloaded and analyzed [after the fact]. What about real-time care of patients across healthcare systems?" he said.

In chronic care, what matters most is that a doctor be able to access clinical data warehouses that contain information on thousands of similar patients.

"What matters is the ability for the doctor to say you have these four diseases and you're taking these four drugs, here are the results of treating these other similar patients," Shah said. "There is no clinical trial that has ever looked at these four diseases and the effect of these four drugs."

When data from EHRs can be exchanged seamlessly, a physician will be able to query what thousands of other doctors did in the same situation.

"Then I want to ask myself, 'What am I worried about with this person: Am I worried about blood clots or heart attack?'" Shah said. "Then I can query what happened to the 1,000 other people who suffered a blood clot and determine ... that outcome in those people very similar to you."

"It's sort of like doing a clinical trial in silicon," Shah continued. "I refer to this whole process as practice-based medicine."
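
The kind of query Shah describes can be sketched in a few lines. The example assumes a hypothetical, already-cleaned list of patient records held in memory; the record layout, condition names and drug names are invented, and a real clinical data warehouse would be queried through its own tools rather than a Python list.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical, simplified record; real EHR data is far messier."""
    conditions: FrozenSet[str]
    drugs: FrozenSet[str]
    had_blood_clot: bool


def similar_patients(records: List[PatientRecord],
                     conditions: FrozenSet[str],
                     drugs: FrozenSet[str]) -> List[PatientRecord]:
    """Keep records that share all of the given conditions and drugs."""
    return [r for r in records
            if conditions <= r.conditions and drugs <= r.drugs]


def outcome_rate(cohort: List[PatientRecord]) -> Optional[float]:
    """Fraction of the cohort with a blood clot, or None for an empty cohort."""
    if not cohort:
        return None
    return sum(r.had_blood_clot for r in cohort) / len(cohort)


# Invented records standing in for a clinical data warehouse.
records = [
    PatientRecord(frozenset({"hypertension", "diabetes"}),
                  frozenset({"drug_a", "drug_b"}), True),
    PatientRecord(frozenset({"hypertension", "diabetes", "asthma"}),
                  frozenset({"drug_a", "drug_b"}), False),
    PatientRecord(frozenset({"hypertension"}),
                  frozenset({"drug_a"}), False),
    PatientRecord(frozenset({"hypertension", "diabetes"}),
                  frozenset({"drug_a", "drug_b", "drug_c"}), False),
]

cohort = similar_patients(records,
                          frozenset({"hypertension", "diabetes"}),
                          frozenset({"drug_a", "drug_b"}))
rate = outcome_rate(cohort)
print(f"{len(cohort)} similar patients; blood clot rate: {rate}")
```

Matching on shared diseases and drugs is the crudest possible definition of "similar." A working system would also have to account for age, dosage, timing and the many other variables that separate an observed correlation from a dependable treatment decision.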

Historically, medicine has relied on published guidelines for treatment or the results of clinical trials for drug prescriptions, which always focus on one disease and most often use only younger, healthier patients as subjects for tests.

Data pigeon holes
