by Mark MacCarthy

AI-driven content moderation can never be perfect

Oct 26, 2018
Artificial Intelligence, Internet, Legal

Regulatory pressure to rely more on automated takedowns threatens to undermine the legitimacy of online spaces.

At this week’s panel discussion in Brussels hosted by the Center for Democracy and Technology, I discussed the capacities and limitations of AI-driven content moderation techniques. Other panelists included Armineh Nourbakhsh, now with S&P Global, who discussed the Tracer tool she helped develop while at Reuters, and Emma Llansó of CDT, who discussed the recent paper she co-authored, “Mixed Messages? The Limits of Automated Social Media Content Analysis.” Prabhat Agarwal from the European Commission provided an insightful policymaker perspective.

Platforms and other Internet participants use automated procedures to take action against illegal material on their systems, such as hate speech or terrorist material. They also use the same techniques to enforce the voluntary terms of service that they require users to follow in order to maintain stable and attractive online environments. Finally, there is an intermediate type of content that is more than merely unattractive and not quite illegal in itself, but that is harmful and needs to be controlled. Disinformation campaigns fall in this category.

The government pressure to use automated systems is growing

The reason for discussing automatic takedowns of this material right now is not just technical. Policymakers are very concerned about the effectiveness of the systems used for content moderation and are pushing platforms to do more. Sometimes this spills over into regulatory requirements.

The EU's recently announced proposed regulation to prevent terrorist content online is a good example. Much attention has focused on its proposal to allow competent national authorities in the member states to require platforms to remove specific instances of terrorist content within an hour of being notified. But much more worrisome is its requirement that platforms put in place “pro-active measures” designed to prevent terrorist material from appearing on their systems in the first place. Since pro-active measures are automatic blocking systems for terrorist content, the proposed regulation explicitly creates a derogation from the current e-Commerce Directive to impose a duty to monitor systems for terrorist content. Still worse, if a company were to receive too many removal orders and could not reach agreement with the regulator on a design improvement, the company could be required to implement and maintain an automatic takedown system prescribed by the regulator.

Protective measures are needed to make removal decisions fair

There must be explicit standards for the removal decision, and those standards need to be transparent so that users can form expectations about what is out of bounds and what is acceptable. Moreover, there needs to be an explanation for each individual removal decision that describes the specific features or aspects of the content that triggered the removal. Finally, because no system can be perfect, there needs to be a redress mechanism to allow material to be restored when it has been removed in error. As a model, look to the practices of the credit industry in the U.S., which has had these elements of transparency, explanation and redress for generations.

For reasons that Emma Llansó outlined in her paper, it is a mistake to rely solely on automated systems for removals. The error rate is simply too high, and human review is always needed to ensure that the context and meaning of the content are fully considered prior to removal.

There is an affirmative and useful role for these automated systems

First, they can be used for very specific and valuable purposes. The Tracer system, developed by Reuters and described by Armineh, scanned Twitter to detect breaking news stories and passed them on to Reuters editors and journalists for review. It rejected most, but not all, of the false rumors and deliberate attempts to mislead and passed on only material highly likely to be newsworthy, thereby allowing journalists to do their job much more efficiently. Even here, however, protections are needed to avoid, for instance, profiling of particular sources or users.

A second important use is screening material for possible removal. When systems operate at the scale of modern social networks, human review of all potentially harmful material is simply impossible. Automated systems can narrow that material down to a manageable list of problematic content, leaving human reviewers to decide whether removal is warranted.
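The screening role described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Item` type, the `score` field, the `threshold` and the `capacity` are all hypothetical, and no particular platform's system is being described. The point it shows is that the automated step only ranks and shortlists; the removal decision stays with a person.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    score: float  # hypothetical model estimate, between 0 and 1, that the item is harmful


def review_queue(items, threshold=0.8, capacity=100):
    """Return the items a human reviewer should look at, highest score first.

    Nothing is removed here: the function only shortlists content
    whose score meets the (assumed) threshold, up to the reviewers' capacity.
    """
    flagged = [item for item in items if item.score >= threshold]
    flagged.sort(key=lambda item: item.score, reverse=True)
    return flagged[:capacity]


posts = [Item("a", 0.95), Item("b", 0.40), Item("c", 0.85), Item("d", 0.99)]
for item in review_queue(posts, threshold=0.8, capacity=2):
    print(item.item_id, item.score)
```

The threshold and the capacity are where the trade-off sits: lowering the threshold catches more harmful material but lengthens the list that reviewers must work through.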

Finally, automatic systems can be very effective in preventing the uploading of material that has previously been identified and removed. This model is effectively used for child pornography images and, through an automated content ID system, for preventing the unauthorized uploading of copyrighted material.
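A minimal sketch of this kind of re-upload blocking follows. It assumes an exact-match SHA-256 fingerprint, which is a simplification chosen for illustration: a cryptographic hash only catches byte-for-byte copies, and real deployments are generally understood to rely on fingerprints that tolerate small alterations. The `UploadFilter` class and its method names are hypothetical.

```python
import hashlib


class UploadFilter:
    """Blocks uploads whose fingerprint matches previously removed material."""

    def __init__(self):
        self._blocked = set()

    @staticmethod
    def fingerprint(data: bytes) -> str:
        # Assumption: exact-match SHA-256 digest of the file's bytes.
        return hashlib.sha256(data).hexdigest()

    def block(self, data: bytes) -> None:
        """Record material that has been identified and removed."""
        self._blocked.add(self.fingerprint(data))

    def allows(self, data: bytes) -> bool:
        """Return True if the upload does not match any blocked fingerprint."""
        return self.fingerprint(data) not in self._blocked


upload_filter = UploadFilter()
upload_filter.block(b"previously removed file")
print(upload_filter.allows(b"previously removed file"))   # False: exact copy is blocked
print(upload_filter.allows(b"previously removed file."))  # True: one changed byte evades an exact hash
```

The second call shows the limit of this simplified approach: any alteration to the file produces a different fingerprint, so an exact-match filter is only as good as the match between the upload and what was removed before.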

The way forward is through public-private partnerships

The concerns about automatic takedowns are not meant to undermine the general thrust of the movement to remove dangerous material from online systems, which, in light of the very serious challenges of terrorism, is a matter of the utmost urgency. However, we must look to the future as well as to our immediate needs and should be careful to build in safeguards that would prevent the abuse of any new regulatory authority in very different political circumstances, where the risk might be the overreach of government agencies.

The way forward is to recognize that the problem is larger than tech. Terrorist events emerge from underlying social, political and economic processes that the technology industry did not create and cannot on its own ameliorate. But tech companies' systems should not be used to make the problem worse, and their removal efforts would benefit from public-private partnerships that rely on government for information and referrals and for convening industry players to develop and implement industry-wide best practices.

A significant danger, however, is that policymakers appear willing to disregard the real costs of errors, apparently thinking mistakes are worth it as long as we succeed in removing the dangerous material. After all, they seem to think, platforms can always restore material removed in error.

This perspective ignores the timeliness of much online communication, where restoration weeks or even days later is pointless. Moreover, the scale of the errors involved can easily be underestimated. A small error rate among billions of uses can mean not hundreds, not thousands, but hundreds of millions of mistakes. Taken together and distributed over time, errors of this magnitude would undermine the credibility and legitimacy of these systems and forfeit their manifold benefits for legitimate political discourse.
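The arithmetic behind that claim is easy to check. The figures below are illustrative assumptions, not measurements from any platform: 10 billion moderation decisions and error rates of 1 and 5 percent.

```python
def expected_errors(items: int, error_rate: float) -> int:
    """Expected number of wrong decisions for a given volume and error rate."""
    return round(items * error_rate)


# Assumed, illustrative volumes and error rates.
for items, rate in [(10_000_000_000, 0.01), (10_000_000_000, 0.05)]:
    errors = expected_errors(items, rate)
    print(f"{items:,} decisions at a {rate:.0%} error rate -> {errors:,} errors")
```

Under these assumptions, even a system that is right 99 percent of the time makes 100 million mistakes, and one that is right 95 percent of the time makes 500 million.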