IRS combats fraud with advanced data analytics

The Internal Revenue Service deals with billions of dollars in identity theft tax refund fraud every year. Anomaly detection powered by advanced analytics is helping to combat it.

The Government Accountability Office (GAO) estimates that criminals attempted at least $14 billion in identity theft tax refund fraud in 2015, and that the Internal Revenue Service (IRS) paid out at least $2.24 billion of that amount. To combat this fraud and protect taxpayers, the IRS has turned to advanced analytics.

The IRS began developing the Return Review Program (RRP) in 2009 to replace a system that was no longer capable of keeping pace with the increasing levels and sophistication of fraud that the service faced, as well as the service’s evolving compliance needs.

"Our Return Review Program is the service's primary system for detecting identity theft and pre-refund fraud in the tax system," says Michael Cockrell, director of data delivery services for IRS IT Applications Development. "Think of RRP as an engine that sits within the tax processing pipeline, scanning for potential identity theft and fraud."

The scale of data the IRS deals with is enormous. It collected nearly $3.5 trillion in gross taxes in fiscal 2018 and issued more than 122 million refunds, amounting to nearly $464 billion. According to the IRS, tax noncompliance, including refund fraud, threatens the integrity of the tax system and increases the tax burden on honest citizens. Identity theft and tax refund fraud techniques have become increasingly sophisticated over the years, requiring the service to develop an advanced response to ensure tax refunds are protected from fraudsters.

RRP, which has earned the IRS a CIO 100 Award in IT Excellence, uses predictive fraud and non-compliance detection techniques and models that seek out subtle data patterns to determine the reliability of return data, including a filer's identity. Past and current taxpayer returns, as well as data from businesses, partnerships, non-profits, government employers, estates, trusts, and whistleblowers, are analyzed to assign scores to returns that involve refunds, based on characteristics of identity theft and other refund fraud.
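
The article does not describe RRP's actual models, so the following is only a rough illustration of the general idea of scoring returns for anomalies. It is a minimal sketch that assumes a single hypothetical feature (the ratio of refund to reported wages) and a textbook robust z-score based on the median absolute deviation; the field names, the sample data, and the 3.5 threshold are all assumptions made for the example, not anything the IRS has published.

```python
from statistics import median


def mad_scores(values):
    """Return a robust z-score for each value, using the median and MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return [0.0 for _ in values]
    # 1.4826 rescales the MAD so scores are comparable to standard deviations
    # when the data are roughly normal.
    return [abs(v - med) / (1.4826 * mad) for v in values]


def flag_returns(returns, threshold=3.5):
    """Return the IDs of returns whose refund-to-wage ratio is an outlier.

    Assumes every return has wages > 0; the threshold is illustrative.
    """
    ratios = [r["refund"] / r["wages"] for r in returns]
    scores = mad_scores(ratios)
    return [r["id"] for r, s in zip(returns, scores) if s > threshold]


# Hypothetical sample data, invented for this sketch.
returns = [
    {"id": "A", "wages": 50000, "refund": 2500},
    {"id": "B", "wages": 62000, "refund": 3000},
    {"id": "C", "wages": 48000, "refund": 2300},
    {"id": "D", "wages": 55000, "refund": 2900},
    {"id": "E", "wages": 51000, "refund": 9800},
]

flagged = flag_returns(returns)
print(flagged)
```

In this toy data set, only return E claims a refund far out of proportion to its wages relative to the others, so it is the one that gets flagged. A production system of the kind the article describes would presumably combine many such signals across far more data sources before scoring a return.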
