Big data analytics and NLP: How health plans can make more money -- and keep it

Natural language processing is an emerging area that can help unlock value from the vast stores of unstructured data that account for as much as 80% of all clinical data. UPMC Health Plan does just that.

Big data analytics in healthcare has largely been about looking at claims, electronic health records (EHR) and other forms of structured data. Natural language processing (NLP) is an emerging area that can help unlock value from the vast amounts of unstructured data that are pervasive in healthcare. In the emerging era of value-based payments, risk adjustments may well determine the difference between profit and loss for the health insurance industry.

UPMC Health Plan, the health insurance arm of the University of Pittsburgh Medical Center (UPMC), has deployed NLP-based technology and big data analytics to efficiently process millions of pieces of documentation to accurately identify risk adjustment possibilities and capture incremental revenue.

Risk adjustment — the money on the table for health plans

Under the risk adjustment program for Medicare Advantage, the Centers for Medicare and Medicaid Services (CMS) adjusts reimbursement amounts based on risk scores that take into account a variety of conditions. The purpose is to adequately cover the costs of providing healthcare, especially to those with complex conditions. Under this program, health plans can increase revenues by submitting documentation from doctor-patient interactions that justifies the risk adjustment. However, they often fail to capitalize on this opportunity for a variety of reasons. The result is a potential loss of revenue for the health plan.

Despite the billions of dollars spent on digitizing medical records under the so-called meaningful use provisions of the HITECH Act, the vast majority of clinical data (widely estimated at around 80%) is unstructured: clinical notes, audio transcripts, images and so on. Since unstructured data can support risk adjustment claims, big data and NLP technologies can be used to parse these types of documentation for evidence of incremental risk that can qualify for additional payments.
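To make the idea of parsing notes for evidence concrete, here is a minimal Python sketch. It is a stand-in, not the technology UPMC uses: a handful of regular expressions and a crude negation check take the place of real clinical NLP, and the phrase list, the negation cues and the `find_mentions` helper are all assumptions made for illustration.

```python
import re

# Hypothetical phrase list. A production system would rely on clinical NLP
# and a full terminology, not a few regular expressions.
CONDITION_PATTERNS = {
    "type 2 diabetes": re.compile(r"\btype (?:2|ii) diabetes\b", re.IGNORECASE),
    "heart failure": re.compile(r"\b(?:congestive )?heart failure\b", re.IGNORECASE),
    "COPD": re.compile(
        r"\b(?:copd|chronic obstructive pulmonary disease)\b", re.IGNORECASE
    ),
}

# Crude negation cue (an assumption): ignore a mention when one of these words
# appears within 40 characters before it, with no sentence break in between.
NEGATION = re.compile(
    r"\b(?:no|denies|without|negative for)\b[^.]{0,40}$", re.IGNORECASE
)


def find_mentions(note: str) -> list[dict]:
    """Return non-negated condition mentions with their character offsets."""
    mentions = []
    for condition, pattern in CONDITION_PATTERNS.items():
        for match in pattern.finditer(note):
            if NEGATION.search(note[: match.start()]):
                continue  # e.g. "No history of heart failure"
            mentions.append(
                {
                    "condition": condition,
                    "start": match.start(),
                    "end": match.end(),
                    "text": match.group(0),
                }
            )
    return sorted(mentions, key=lambda m: m["start"])
```

Keeping the character offsets with each mention matters: a human coder can jump straight to the sentence that supports a suggestion instead of rereading the whole note.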

Health plans typically have large teams of certified coders (people trained on the International Classification of Diseases, or ICD-10) who review claims under the risk adjustment program. Given the 9,000 or so ICD-10 codes that map to some 79 CMS hierarchical condition categories (HCC), there are simply too many combinations for humans to handle. The process is labor-intensive, expensive and error-prone. Moreover, it is well-nigh impossible to “brute-force” a way through all of the unstructured data sitting in millions of documents to unearth evidence for risk adjustment.
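The roll-up from diagnosis codes to condition categories can be sketched in a few lines of Python. The four code-to-category pairs and the single hierarchy rule below are assumptions chosen for illustration and have not been checked against a current CMS mapping table; `hccs_for` and `candidate_gaps` are hypothetical helper names.

```python
# Illustrative subset only. These pairs are assumptions for the example and
# should be checked against the CMS mapping table for the relevant model year.
ICD10_TO_HCC = {
    "E11.9": 19,   # type 2 diabetes without complications
    "E11.65": 18,  # type 2 diabetes with hyperglycemia
    "I50.9": 85,   # heart failure, unspecified
    "J44.9": 111,  # COPD, unspecified
}

# Hypothetical hierarchy rule: a more severe category suppresses the listed ones.
HCC_SUPPRESSES = {18: {19}}


def hccs_for(codes):
    """Roll ICD-10 codes up to HCCs, then apply the hierarchy."""
    hccs = {ICD10_TO_HCC[code] for code in codes if code in ICD10_TO_HCC}
    for parent, children in HCC_SUPPRESSES.items():
        if parent in hccs:
            hccs -= children
    return hccs


def candidate_gaps(codes_in_notes, codes_on_claims):
    """HCCs that the notes would add beyond what the claims already carry."""
    combined = set(codes_in_notes) | set(codes_on_claims)
    return hccs_for(combined) - hccs_for(codes_on_claims)
```

For example, `candidate_gaps(["E11.9", "I50.9"], ["E11.9"])` returns `{85}`: the notes support a heart failure category that the submitted claim does not yet carry. With thousands of codes and dozens of categories, this kind of set comparison is trivial for software and tedious for people.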

Big data and NLP for unlocking value in unstructured data

With the emergence of NLP tools, it is now possible to process millions of documents much more efficiently and identify opportunities much more accurately.

By partnering with Health Fidelity, a Silicon Valley startup that has built a big data analytics platform based on NLP technology, UPMC obtains accurate coding suggestions for risk adjustment from vast amounts of unstructured data such as clinical notes. UPMC has also made a strategic investment in the company, which licensed the NLP technology from Columbia University, where it was originally developed. According to John Wisniewski, chief actuary at UPMC Health Plan, the technology has allowed the organization to "make more money -- and keep it, which allows us to keep our premiums low." Using techniques such as machine learning and association mining to identify patterns, the NLP platform presents coders with suggested risk adjustments grounded in standard terminology and robust taxonomies.

The ROI on NLP technology comes from improved productivity as well as increased reimbursements. The financial returns to UPMC Health Plan have been in the range of $40 million a year over the two years it has used the technology.

Augmenting human expertise, improving productivity

NLP technology and big data analytics are a means of augmenting, not replacing, human knowledge. Experienced coders take the recommendations from the platform and check them for accuracy before modifying claims to include additional information. The technology and tools thus complement the work of expert coders. Because the platform identifies and helps prioritize claims for review in a fraction of the time the process used to take, UPMC's coders have been able to increase throughput by a factor of four, leading to accelerated cash flows and increased revenues.

An important aspect of the improved accuracy is the avoidance of penalties that might arise from CMS Risk Adjustment Data Validation (RADV) audits, which are intended to identify and recover improper payments. UPMC's NLP platform can "remember" where to find supporting documentation for any claim, years after the claim was submitted and reimbursed. The platform reduces the effort and cost of going through these cumbersome and increasingly frequent audits.
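One way to picture a platform that "remembers" its supporting documentation is an index from each claim and condition category to the exact passage that justified it. The Python sketch below is an in-memory illustration under that assumption. The `Evidence` and `EvidenceIndex` names, the identifier formats and the storage approach are all hypothetical, and a real system would persist this data rather than hold it in memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    document_id: str  # identifier of the source note (hypothetical scheme)
    start: int        # character offset where the supporting text begins
    end: int          # character offset where it ends
    snippet: str      # the supporting text itself


class EvidenceIndex:
    """Maps (claim, HCC) pairs to the passages that supported them."""

    def __init__(self):
        self._by_claim = defaultdict(list)

    def record(self, claim_id, hcc, evidence):
        """Store a pointer to supporting text when a claim is coded."""
        self._by_claim[(claim_id, hcc)].append(evidence)

    def lookup(self, claim_id, hcc):
        """Retrieve the stored pointers, for example during an audit."""
        return list(self._by_claim.get((claim_id, hcc), []))
```

Recording the pointer at coding time is the design choice that pays off later: when an auditor asks about a claim, the supporting passage is a lookup away rather than a fresh search through old charts.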

A final, important aspect of the use of NLP is the opportunity to improve HEDIS scores by identifying gaps in care that can help improve health outcomes.

NLP is rapidly gaining ground as a technology that can recognize and analyze human language, as opposed to machine language or programming commands (think Siri or Alexa). Through machine learning and artificial intelligence, NLP technologies can improve in accuracy over time.

Many health plans are turning to big data analytics on unstructured data, using techniques such as NLP to process vast data sets efficiently and identify hidden revenue opportunities. Those that do not get on the bandwagon are likely to be busy writing refund checks to CMS for improper payments.

This article is published as part of the IDG Contributor Network.
