IRS combats fraud with advanced data analytics

The Internal Revenue Service deals with billions of dollars in identity theft tax refund fraud every year. Anomaly detection powered by advanced analytics is helping to combat it.

The Government Accountability Office (GAO) estimates that criminals attempted at least $14 billion in identity theft tax refund fraud in 2015, and the Internal Revenue Service (IRS) paid out at least $2.24 billion of that amount. To combat this fraud and protect taxpayers, the IRS has turned to advanced analytics.

The IRS began developing the Return Review Program (RRP) in 2009 to replace a system that was no longer capable of keeping pace with the increasing levels and sophistication of fraud that the service faced, as well as the service’s evolving compliance needs.

"Our Return Review Program is the service's primary system for detecting identity theft and pre-refund fraud in the tax system," says Michael Cockrell, director of data delivery services for IRS IT Applications Development. "Think of RRP as an engine that sits within the tax processing pipeline, scanning for potential identity theft and fraud."

The scale of data the IRS deals with is enormous. The agency collected nearly $3.5 trillion in gross taxes in fiscal 2018 and issued more than 122 million refunds, amounting to nearly $464 billion. According to the IRS, tax noncompliance, including refund fraud, threatens the integrity of the tax system and increases the tax burden on honest citizens. Identity theft and tax refund fraud techniques have become increasingly sophisticated over the years, requiring the service to develop an advanced response to ensure tax refunds are protected from fraudsters.

RRP, which has earned the IRS a CIO 100 Award in IT Excellence, uses predictive fraud and non-compliance detection techniques and models that seek out subtle data patterns to determine the reliability of return data, including a filer's identity. Past and current taxpayer returns, as well as data from businesses, partnerships, non-profits, government employers, estates, trusts, and whistleblowers, are analyzed to assign scores to returns that involve refunds, based on characteristics of identity theft and other refund fraud.

"As taxpayers submit their returns, whether it's electronic or paper, all of that will get fed through the tax processing pipeline, and RRP looks at all these returns and specifically those that have a refund. So, we load all this return data, but we run our analytics against those returns that have a refund and we flag those things that come out as either identity theft or potential fraud," Cockrell says.
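The flow Cockrell describes (load every return, score only those claiming a refund, flag the suspicious ones) can be illustrated with a minimal sketch. This is not the IRS's implementation: the field names `id` and `refund_amount`, the `score_fn` callable, and the 0.8 threshold are all hypothetical stand-ins chosen for illustration.

```python
def flag_refund_returns(returns, score_fn, threshold=0.8):
    """Score only returns that claim a refund and flag the high scorers.

    `returns` is a list of dicts; `score_fn` is any callable that maps a
    return to a risk score in [0, 1]. Field names, the scoring function,
    and the threshold are assumptions, not details from the article.
    """
    flagged = []
    for ret in returns:
        # All returns are loaded, but only refund returns are analyzed.
        if ret.get("refund_amount", 0) <= 0:
            continue
        score = score_fn(ret)
        if score >= threshold:
            flagged.append((ret["id"], score))
    return flagged
```

In a real system the scoring function would be a trained model rather than a hand-written rule, but the filter-then-score shape would be similar.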

Cockrell adds that RRP uses clustering, sometimes referred to as linked return analysis, to look at returns over time, seeking elements that might be shared and thereby indicate that something connects one fraudulent return to others. That enables the service to pinpoint whether a fraudulent return is a single instance or linked to something bigger.
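One simple way to picture linked return analysis is to group returns that share any identifying attribute, so that a chain of shared values ties several returns into one cluster. The sketch below uses a union-find structure for that grouping. The attribute names (`bank_account`, `address`, `device_id`) are hypothetical, and the article does not say which technique or fields RRP actually uses.

```python
from collections import defaultdict


def link_returns(returns, keys=("bank_account", "address", "device_id")):
    """Group return IDs that share a value in any of the given fields.

    The field names in `keys` are illustrative assumptions only.
    """
    parent = {ret["id"]: ret["id"] for ret in returns}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> first return ID carrying that value
    for ret in returns:
        for key in keys:
            value = ret.get(key)
            if value is None:
                continue
            if (key, value) in first_seen:
                union(first_seen[(key, value)], ret["id"])
            else:
                first_seen[(key, value)] = ret["id"]

    clusters = defaultdict(list)
    for ret in returns:
        clusters[find(ret["id"])].append(ret["id"])
    return sorted(sorted(ids) for ids in clusters.values())
```

A cluster of size one suggests an isolated case, while a larger cluster hints that the returns may belong to a single scheme.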

"Fraudsters have access to more and more taxpayer data. In an electronic world, they can pretty much mimic a taxpayer. Things that we used to think were sensitive, that only we knew about, that isn't the case. They have access to those data through breaches that have happened over time and through data from other sources," Cockrell explains.

The IRS gets agile

Because fraudsters have so many ways of getting the real data of taxpayers in order to mimic them, and because the schemes they employ are getting ever more sophisticated, Cockrell's team relies on business partners in the service's Wage and Investment Division, Criminal Investigation Division, and Research, Applied Analytics, and Statistics Division to analyze data that's been captured over time and identify new schemes. These partners helped Cockrell's team identify the breadth of business and technical requirements. The team also leveraged external experts from industry partners to help define the capabilities, system requirements, and architecture necessary to ensure RRP could continue to adjust to evolving threats and scale to meet the increased processing demands during the tax filing season.

"The fraudsters are becoming more and more sophisticated and are trying more and more things," says Linda Gilpin, associate CIO of the IRS IT Enterprise Program Management Office. "One of the key challenges is staying ahead of them and not letting them win, in the sense of coming up with sophisticated schemes that we can't catch. We have to be on top of that. And that constant evolution is one of the key challenges that make this program so valuable."

To get there, Cockrell says the service, which has always prided itself on its ability to plan for and execute big, multi-year releases, had to learn to focus on smaller increments of well-defined features and capabilities. It adopted agile development two years ago, which required it to change how it plans software updates and to train staff and business customers in the new approach.

"In the early years of RRP we had this big challenge, and I think the initial reaction was to swing for the fences," Cockrell says. "We're really good at defining multi-year plans and targeting big, multi-year releases. But I think what has become more apparent to us now is being more agile is a better path, especially given the speed of technology and the increased sophistication of fraudsters. Agile development allows us to work with our business customers to define smaller increments of functionality that's targeting specific, high-priority needs today. It's helping us define things and deliver soon, and to be in a better position in combatting fraud."

Using RRP, the IRS says it has been able to better protect federal revenues and the tax refunds of millions of Americans. A GAO report estimates that RRP prevented the issuance of more than $6.51 billion in invalid refunds between January 2015 and November 2017. As a result, RRP has become a key component of the service's modernization efforts.

"Beyond the actual dollar value, which is huge, there is the value that all taxpayers have: knowing that we are very focused on making sure that people pay their fair share and that fraud isn't allowed to go rampant unchecked," Gilpin says. "The honest taxpayers are the vast majority, and they want to know that."

Copyright © 2019 IDG Communications, Inc.
