by Bart Perkins

AI’s dark secret? A desire for data without bounds

Dec 24, 2018
Analytics, Artificial Intelligence, Compliance

The AI revolution is hungry for personal data. The U.S. needs to pursue federal privacy legislation before machine intelligence and surveillance intersect.

AI offers the potential to help humankind in many ways. Driverless cars and smart infrastructure hold the promise to reduce congestion by facilitating the movement of people through cities. Improved diagnosis and treatments are increasing lifespans. In the enterprise, AI can help improve hiring decisions, make the factory floor safer, automate routine tasks, produce more objective performance reviews, and help organizations understand their customers.

New tools are appearing frequently. Amazon has been granted two patents for a wristband that tracks workers’ hand movements as they pack boxes while filling orders. The wristbands use radio frequency to track hand movement so precisely that the bands vibrate to nudge the hands in the proper direction when inefficient movements are detected. Humanyze sells sociometric badges that track employee movement through offices to provide insights regarding the quality of interactions with colleagues. L’Oréal’s UV Sense tracks the wearer’s exposure to ultraviolet rays, then transfers the data to the user’s mobile phone. Cogito monitors the empathy displayed by customer service representatives handling calls.

Unfortunately, the large amounts of data required to unlock the benefits of these tools also make consumer and employee surveillance much easier. Launched in 2014, China’s social credit system is expected to be fully operational by 2020. The system aggregates payment history, medical information, legal records, and other data to create an individual profile. In addition, it is widely believed that the system uses facial recognition to track where every individual travels and with whom she interacts. Cameras are so widespread in major cities that people joke that the government can find anyone in seven minutes.

On a more intimate level, We-Vibe sells a Bluetooth-enabled vibrator that can be controlled by a smartphone. In 2017, without admitting fault, We-Vibe agreed to pay $3.75 million to settle a class-action suit asserting that the company collected data on how frequently the sex toy was used and the different ways it was used.

It’s true that large amounts of data benefit society in unexpected ways. In April 2018, Joseph DeAngelo was charged as the Golden State Killer, responsible for a series of unsolved murders and rapes from the 1970s and ’80s. Police compared a genetic profile of the suspect with genetic profiles on websites serving individuals researching their family trees. Eventually they identified a group of people who shared genetic material linked to the perpetrator, ruling out a number of these individuals based on age, sex, and where they lived. After the police determined DeAngelo was a suspect, they obtained his DNA from trash he discarded and found it matched crime scene samples.

While these genetic databases helped the police solve decades-old crimes, their use raises privacy concerns. Few people contributing genetic data to a repository expect the file to be reviewed by law enforcement. Contributors may not realize that their action could lead authorities to relatives who have not agreed to make genetic information public or who may not even know that family genetic material is in the database.

Privacy concerns are growing throughout the world. In May 2018, the European Union implemented comprehensive privacy protection rules, known as the General Data Protection Regulation (GDPR). Other countries are implementing their own legislation. Canada’s National Cyber Security Strategy protects digital privacy. India’s proposed Personal Data Protection Bill emphasizes AI ethics, personal privacy, security, and transparency. It is to be overseen by an independent regulatory body with heavy penalties for violations. According to the United Nations, over one hundred countries have some data protection and privacy laws. However, the UN acknowledges that many have not been updated recently and are not strong enough to give citizens confidence that their privacy will be protected adequately.

In 2016, a number of Amazon, Apple, Facebook, Google, and Microsoft engineers, designers, and other employees grew so concerned about the potential for surveillance that they created the Never Again pledge. The name refers to the role IBM technology played during World War II, when punch cards helped the U.S. government manage the internment of Japanese-Americans and helped the Nazis track victims of the Holocaust. Never Again signers pledge to refuse to build any database that would allow the U.S. government to collect information about individuals’ religious beliefs, fearing that such a database could result in mass deportations.

During the October 2018 International Conference of Data Protection and Privacy Commissioners in Brussels, Tim Cook voiced his concerns about the “data industrial complex.” He warned that individual pieces of data are being collected and synthesized to create an “enduring digital profile that lets companies know you better than you may know yourself.”

He went on to support GDPR and to call for U.S. privacy legislation.

7 principles of privacy regulation

Cook is right: it is time for the U.S. to follow other countries and enact digital privacy legislation. At a minimum, the following principles should be included in new legislation.

Minimize data collected

Data collection should be limited to the data required for the task at hand. Collection of additional data that might someday be useful should be limited. Personally identifiable information should be removed from data stored for analysis.

Inform users of data gathering

Users should be informed when data is being gathered and should be told how it will be used. The individual should be able to decide whether to allow data to be collected during each activity. Browser and website terms and conditions should be simple enough that the consumer can make an informed decision quickly.

Allow individuals access to their data

Individuals should be able to copy their own data and to correct or delete inaccurate personal data. Today, corrections can be almost impossible with some enterprises. A colleague recently attempted to update her credit report by removing the house next door as a prior residence. Even though she identified the merchant that changed the last digit of the street address, the credit bureau declined to remove the entry, stating that she had failed to prove the other house was never her residence. The credit bureau did not respond when she asked for guidance regarding what would constitute proof.

Require decision transparency

Increasingly, AI is making legal, hiring, college admissions, and other decisions with a big impact on individual lives. Since AI uses such large amounts of data with so many rules, it is often impossible for a person to understand how the AI engine arrived at a conclusion. While it is not critical for a human to understand the logic used when the AI engine is playing a game, such opacity is not acceptable in situations where bias or unintended consequences can impact the final result.

The Equivant tool predicts the likelihood that an individual will commit a new crime. Judges who use the tool during sentencing usually impose harsher terms for defendants with high scores. Unfortunately, defendants have no way to challenge their score, since the scoring mechanism is proprietary.

ProPublica, a public interest investigative journalism organization, analyzed 7,000 defendants in Broward County, Florida, in 2013 and 2014. ProPublica compared the Equivant prediction of which defendants would commit crimes in the following two years with the actual new crimes committed by each defendant. They concluded that the Equivant risk score is unreliable; it is twice as likely to incorrectly predict that a black defendant will commit a future crime as a white defendant. Equivant disagrees, contending that the ProPublica methodology is flawed. Clearly, sentencing and other tools with such a large potential impact on individual lives need to be highly accurate and supported by understandable logic.

Secure the data

Every entity that collects and stores data needs to make sure the data is secure and cannot be accessed inappropriately. The recent Marriott and Quora data breaches are two more in a long line of incidents that have already compromised too many people.

Enforce compliance

Active enforcement with significant penalties is critical for effective new regulations. Major technology companies make so much money that minor penalties could be ignored as the cost of doing business.

Create a national law

New U.S. privacy laws should cover all enterprises that collect personal data, not just the technology companies. In addition, federal legislation is needed; otherwise, other states will follow California’s lead and create their own, probably incompatible, state laws.

Legislation’s impact on business

New privacy legislation will force some companies to reconsider their business models. Today, many consumers contribute personal data in exchange for not being charged for using search engines, social media, and other tools. Consumers concerned about privacy might prefer to pay a small fee in exchange for true anonymity. Although it is theoretically possible to browse the web today without leaving any trace, most consumers find it impossible without highly skilled technical help.

Careful planning and some compromises will be required in the workplace to balance businesses’ need to deploy new tools with employees’ desire to maintain personal privacy. New legislation could create a situation in which employees have to agree to have their data collected. Depending on the language in new legislation, many existing tools could produce what would be considered personal information, allowing employees to decline to be tracked and thereby making the tools useless.

Compliance with new legislation will undoubtedly force companies to handle employee data with greater care. Supervisor access could be limited to employee data that is directly related to the individual’s performance, with analysis tools run on anonymized data only. Well-run enterprises could offer independent coaching to employees who want to improve their performance.

AI provides many benefits and will provide others we cannot envision at this time. As with all new technologies, it can be used to benefit humankind or it can be abused. When we figure out how to ensure the data collected is not used for inappropriate surveillance, perhaps Scott McNealy’s 1999 assertion, “You have zero privacy anyway. Get over it,” will no longer be true.