We’re used to algorithms recommending books, movies, music and websites. Algorithms also trade stocks and predict crime, identify diabetics and monitor sleep apnea, find dates (and babysitters), calculate routes and assess your driving, and even build other algorithms. These sets of mathematical instructions, which can reach thousands of pages of code and routinely crunch hundreds of variables, may someday run our lives. Companies increasingly use them to run their digital businesses and gain competitive advantage.
Unleashing an algorithm can lead to new customers and revenue, but it can also bring ethical and legal trouble. Already, consumer advocates and regulators are training their sights on the dark side of the algorithm revolution, such as creepy over-personalization and the potential for illegal price discrimination.
As CEOs look to chief digital officers and data scientists to conquer the next frontier, CIOs have sometimes been on the sidelines, whether by choice or default. But as business leaders, CIOs may now have to elbow into meetings where Ph.D.s, corporate lawyers and other colleagues are talking about the data-driven future. CIOs need to join those conversations to help steer company strategy, certainly, but also to contribute to decisions about what data to pour into an algorithm and what to keep out, and how to monitor what the algorithm does.
That includes devising a defensible policy for handling the information produced, says Frank Pasquale, a professor of law at the University of Maryland. “Algorithmic accountability” will become part of the IT leader’s job, he says.
That realization can hurt. Athena Capital Research, a high-frequency stock trader, used a proprietary algorithm called Gravy to slip in big buy and sell orders milliseconds before the NASDAQ exchange closed for the day in order to push stock prices higher or lower, to Athena’s advantage. The Securities and Exchange Commission viewed that as illegal manipulation and last year called out Athena’s CTO for helping other managers plot the most effective use of Gravy during at least six months in 2009. Athena settled the case for $1 million.
No one says CIOs must delve into Ph.D.-level math. But a working knowledge of basic concepts behind algorithms can help avoid bad results and bad press. “Algorithms allow us to get rid of biases we thought were there in human decision-making,” says Michael Luca, an assistant professor at Harvard Business School. “But pitfalls are equally important to think about.”
Algorithms can be used to make operations more efficient, answer “what if” questions and make new products and services possible. At United Parcel Service, the 1,000-page Orion algorithm does all of that. In 2003, UPS started building Orion (for On-Road Integrated Optimization and Navigation) to optimize delivery routes. You might have six errands to do on a given day. A UPS driver has about 120. The company wanted to save time and fuel by having drivers follow the most efficient routes possible while still making deliveries on time, says Jack Levis, director of process management. Levis oversees Orion and the team of 700 engineers, mathematicians and others who support it.
Cutting just one mile per driver per day saves $50 million per year, Levis says, and Orion has so far saved seven to eight miles per driver per day. UPS is on track to save $300 million to $400 million per year in gas and other costs by 2017.
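The arithmetic behind those figures can be checked in a few lines. This is a rough sketch that assumes savings scale linearly with miles cut, which is our simplification and not something Levis states; the constant and function names are ours.

```python
# Figure quoted by Levis: cutting one mile per driver per day
# saves about $50 million per year.
SAVINGS_PER_MILE_USD = 50_000_000


def annual_savings(miles_cut_per_driver_per_day: float) -> float:
    """Estimate yearly savings, assuming savings grow linearly with miles cut."""
    return miles_cut_per_driver_per_day * SAVINGS_PER_MILE_USD


# Orion has so far saved seven to eight miles per driver per day.
low = annual_savings(7)
high = annual_savings(8)
print(f"${low / 1e6:.0f} million to ${high / 1e6:.0f} million per year")
```

Under that linear assumption, seven to eight miles works out to $350 million to $400 million a year, which sits at the upper end of the range UPS cites for 2017.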
The most important thing any manager can do when embarking on an algorithm project is to “work backwards,” Levis says. That is, define carefully what business decisions the company struggles with, then identify what knowledge would help with those decisions and what information would supply that knowledge. Then identify the raw data that, once combined, teased apart and interpreted, would provide that information.
UPS spent nine years working on Orion before putting it into production, adding and subtracting data, testing, then adding and subtracting again. For example, at first Orion used publicly available maps. But they weren’t detailed enough. So UPS drew its own, showing features such as a customer’s half-mile driveway or a back alley that shaves time getting to a receiving dock–data points that Orion needs in order to plan how to get a package delivered by 10:30 a.m.
But an algorithm created by data scientists in a laboratory can’t anticipate every factor or account for every nuance. Suppose a business customer typically receives one package per day. If Orion knows the package isn’t tied to a certain delivery time, the algorithm might suggest dropping it off in the morning one day but in the afternoon the next, depending on the day’s tasks. That might be the most efficient approach for UPS, but customers wouldn’t know what to expect if delivery times changed frequently.
People don’t like that amount of uncertainty, and it might have cost UPS business. Companies often take deliveries in the morning, go about their business during the day and then call UPS back to request a late-afternoon pickup of an outgoing package. If UPS pushed deliveries to the end of the day for efficiency’s sake, it might not get that later call, Levis says. “We started realizing the rules we told the algorithm weren’t as good as they should have been,” he says. “We’ve learned you need to balance optimality with consistency.”
The Orion team is outside IT, but Levis says the IT group built the production version of Orion and CIO Dave Barnes understands what Orion can and can’t do, which is critical when he helps UPS devise business strategy. UPS’s My Choice service, which notifies customers of pending deliveries and lets them change delivery times or locations, wouldn’t be workable without Orion, Levis says. Not only does My Choice reduce multiple delivery attempts, it also brings in new revenue: 7 million customers have signed up for the service and pay $5 per change or $40 per year for unlimited changes. Next, UPS wants to bring it to other countries.
To grow new business from algorithmic insights, companies must look for correlations that competitors haven’t spotted.
Take H&R Block, for example. In December, executives at the provider of tax filing software and services talked in detail with financial analysts about the company’s new algorithm, which tailors marketing email messages and in-software pop-up boxes to individual customers. The company rolled it out this tax season, after starting algorithmic tests to quantify and categorize the behavior of 8,700 tax filers in an effort to predict what customers will do.
CMO Kathy Collins discussed how, for example, H&R Block may know that, based on past behavior, you’re typically a February filer who prefers to interact with the company via mobile device. If you haven’t filed by Feb. 10, the algorithm will suggest that someone nudge you with an email reminder and a discount on help preparing your return. Other customers may receive an email offer the week they receive their W-2 forms.
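The February example amounts to a simple rule layered on top of predicted behavior. Here is a minimal sketch of such a rule. The Customer fields, the should_nudge function and the choice to fire the day after Feb. 10 are all hypothetical; H&R Block has not published how its algorithm is built.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Customer:
    """Hypothetical customer profile; field names are illustrative only."""
    name: str
    typical_filing_month: int          # 1-12, inferred from past behavior
    preferred_channel: str             # e.g. "mobile"
    filed_on: Optional[date] = None    # None means not yet filed this season


def should_nudge(customer: Customer, today: date) -> bool:
    """Suggest a reminder for a typical February filer who has not filed by Feb. 10."""
    if customer.filed_on is not None:
        return False                   # already filed, nothing to do
    if customer.typical_filing_month != 2:
        return False                   # this rule covers only February filers
    # Assumption: "by Feb. 10" means the nudge fires from Feb. 11 onward.
    return today > date(today.year, 2, 10)


late_filer = Customer("Pat", typical_filing_month=2, preferred_channel="mobile")
print(should_nudge(late_filer, date(2015, 2, 11)))
```

A production system would presumably replace the hard-coded month and date with predictions learned per customer, but the decision it hands to marketing has the same shape: a yes or no on whether to send the reminder and discount.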
Over time, H&R Block expects to improve its algorithm by analyzing not only the content of customer tax returns but also the very clicks a taxpayer makes while using its software, said Jason Houseworth, president of global digital and product management. “In our case,” he said, “the personal data is as rich as it gets.”
The personalization made possible through the algorithm, Houseworth said, “will make each user feel that the software was not only designed for them, but is always a step ahead.”
Some customers may like that, but others won’t, says Pasquale, who wrote The Black Box Society: The Secret Algorithms That Control Money and Information. “There’s so much pressure to know more. That’s the arms race I fear.”
The idea of knowing more about people is a driving force at eHarmony. The dating service matches members by their self-identified characteristics, such as hobbies and sexual orientation. But eHarmony also extrapolates what it calls unstated “deep psychological traits,” such as curiosity, by putting answers to questionnaires through various formulas. A neural network also produces a “satisfaction estimator” to rate potential pairings, and the system learns over time, as members report back about their satisfaction with matches eHarmony suggests.
The company doesn’t have a CIO; COO Armen Avedissian handles that role. Decisions about whether to change the algorithm are made by a team that includes Avedissian, CTO Thod Nguyen, vice president of matching Steve Carter and corporate lawyers. “It’s not just hardware and software but the tactics and strategy of data manipulation,” Avedissian says.
The company looks at 29 dimensions of compatibility, such as “emotional energy” and “curiosity,” each of which incorporates several variables collected through detailed questionnaires. More than 125TB of data is involved. The algorithm learns by assessing what a member does with each match that eHarmony suggests (contact right away? ignore?) as well as what feedback the members provide in questionnaires and open-ended responses. That data gets poured back into the equation and the cycle starts again, more informed, Avedissian says.
The more relevant the matches, the higher the rate at which members will communicate with each other. The more they engage, the more likely they are to buy annual subscriptions. All the algorithms at eHarmony are intended to convert registrants into subscribers.
The dating service tests ideas by running slightly different algorithms for different customers, then measuring the rate at which registered members convert to annual subscribers. Risk and compliance teams run their own algorithms to see how the company’s other algorithms are using sensitive data.
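Comparing two algorithm variants this way is a standard A/B test. The sketch below uses a two-proportion z-test, a common way to judge whether a difference in conversion rates is more than noise. The registrant and subscriber counts are invented for illustration, and nothing here is drawn from eHarmony's own tooling.

```python
import math


def conversion_rate(subscribers: int, registrants: int) -> float:
    """Share of registered members who became annual subscribers."""
    return subscribers / registrants


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z statistic for the difference in conversion rates between variants A and B."""
    p_a = conversion_rate(conv_a, n_a)
    p_b = conversion_rate(conv_b, n_b)
    pooled = (conv_a + conv_b) / (n_a + n_b)           # rate if A and B were identical
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


# Invented numbers: 10,000 registrants saw each variant of the matching algorithm.
z = two_proportion_z(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
significant = abs(z) > 1.96                            # roughly a 5% two-sided test
print(f"z = {z:.2f}, significant: {significant}")
```

The point for a CIO is less the statistics than the discipline: each variant needs enough members, and a clear success measure, before anyone declares a winner.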
One recent discovery: Whether someone smokes and drinks turns out to be more important in dating in Europe than in the United States. Once eHarmony more heavily weighted the smoking and drinking variables in its matching algorithm in the U.K., “business just took off,” Avedissian says. In other words, suggested matches were more on-target, so satisfaction increased, and so did conversion rates.
However, not all outcomes are expected.
Uber is upending the taxi business with an app to connect passengers with rides and a proprietary algorithm that, in part, governs “surge pricing,” which raises fares at times of heavy demand. Taxi associations from New York to Paris and back have protested Uber for cutting into their business, and government regulators have challenged the company on questions of fair pricing and safety. Even so, the darling of disruption has raked in an estimated $4.9 billion in investor funding.
In December, the cold, hard math collided with high emotion: Uber’s algorithm automatically jacked up rates in Sydney, Australia, as people tried to get away from a downtown café where an armed man held 17 people hostage. Three people, including the gunman, died. Uber later apologized for raising fares, which reportedly were up to quadruple normal rates, and made refunds. “It’s unfortunate that the perception is that Uber did something against the interests of the public,” a local Uber manager said in a blog post. “We certainly did not intend to.”
Problems are most likely to arise when algorithms make things happen automatically, without human intervention or oversight. Control is critical, says Alistair Croll, a consultant and author of Lean Analytics: Use Data to Build a Better Startup Faster. “If algorithms are how you run your business and you haven’t figured out how to regulate your algorithms,” he says, “then by definition you’re losing control of your business.”
Uber is working on a global policy to cap prices in times of disaster or emergency, a spokeswoman says.
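A cap of that kind is one example of the control Croll describes: a human-set limit wrapped around an automatic calculation. The sketch below is purely hypothetical. Uber's pricing algorithm is proprietary, and the demand-to-supply ratio, the function name and both cap values are our assumptions.

```python
def fare_multiplier(
    ride_requests: int,
    available_drivers: int,
    emergency: bool = False,
    normal_cap: float = 4.0,      # assumed ceiling in ordinary conditions
    emergency_cap: float = 1.0,   # assumed policy: no surge during a declared emergency
) -> float:
    """Toy surge multiplier: demand over supply, never below 1.0, limited by a cap."""
    if available_drivers <= 0:
        raw = normal_cap                       # no supply at all: treat as maximum surge
    else:
        raw = max(1.0, ride_requests / available_drivers)
    cap = emergency_cap if emergency else normal_cap
    return min(raw, cap)


# Same demand spike, with and without the emergency flag set by a human operator.
print(fare_multiplier(400, 100))
print(fare_multiplier(400, 100, emergency=True))
```

The design choice worth noting is that the emergency flag sits outside the pricing math. Someone has to decide who may set it and how quickly, which is a governance question as much as a coding one.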
Other unintended consequences involve the liability of knowing too much.
For example, say a hospital uses patient data to identify people who may be headed toward an illness, then calls them to schedule preventive care. If the math is imperfect, the hospital might overlook someone who later contracts an illness or dies. Or a whole group of people could get overlooked. “There’s concern about who are the winners and losers and can the company stand by it later, when exposed,” Pasquale says.
In another scenario, a company could open itself up to discrimination claims if it keeps too much data and insights about its employees, he says. Someone might be able to prove the company knew about, say, a health condition before letting him go.
Or if a car insurance company discovers there’s a higher chance a customer will get into a crash after driving a certain number of miles, it may find itself in a “duty to warn” situation, Pasquale says. That’s when a party is legally obligated to warn others of a potential hazard that they otherwise couldn’t know about. It usually applies to manufacturers in product liability cases, or to mental health professionals in situations involving dangerous patients. And as the use of revelation-producing algorithms spreads, Pasquale says, people in other sectors could be subject to a similar standard–at least ethically, if not legally.
“At what point will things be a liability for you by knowing too much about your customers?” he asks.
Sometimes companies don’t set out to uncover uncomfortable truths. They just happen upon them.
Insurance company executives, for example, should think carefully about results that could emerge from algorithms that help with policy decisions, says Croll, the consultant and author. That’s true even when a formula looks at metadata, meaning descriptions of customer data rather than the data itself. For example, an algorithm could find that families of customers who had changed their first names were more likely to file claims for suicide, he speculates. Further analysis could conclude that it is likely those customers were transgender people who couldn’t cope with their changes.
An algorithm that identified that pattern would have uncovered a financially valuable piece of information. But if it then suggested that an insurer turn down or charge higher premiums to applicants who had changed their first names, the company might appear to be guilty of discrimination if it did so, Croll says.
The CIO’s Best Role
The best way a CIO can support data science is to choose technologies and processes that keep data clean, current and available, says Chris Pouliot, vice president of data science at Lyft, a competitor of Uber. Before joining Lyft in 2013, Pouliot was director of algorithms and analytics at Netflix for five years and a statistician at Google.
CIOs should also create systems to monitor changes in how data is handled or defined that could throw off the algorithm, he says. Another key: CIOs should understand how best to use algorithms, even if they can’t build algorithms of their own.
For example, if a payment service needs to figure out whether pending transactions could be fraudulent, it might hard-code an algorithm into its payment software. Or the algorithm could be run offline, with the results of the calculations applied after the transaction, potentially preventing future transactions. The CIO has to understand enough about what the service is and how the algorithm works to make such decisions, Pouliot says.
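The two deployment options Pouliot describes can be sketched side by side. Everything in this example is hypothetical: the scoring rule is a toy stand-in for a real fraud model, and the field names and threshold are ours.

```python
from typing import Dict, List, Set


def risk_score(txn: Dict) -> float:
    """Toy stand-in for a fraud model; a real one would be learned from data."""
    score = 0.0
    if txn["amount"] > 1000:
        score += 0.5
    if txn["country"] != txn["home_country"]:
        score += 0.4
    return score


def authorize_inline(txn: Dict, threshold: float = 0.7) -> bool:
    """Option 1: score inside the payment flow and block before the money moves."""
    return risk_score(txn) < threshold


def flag_accounts_offline(completed: List[Dict], threshold: float = 0.7) -> Set[str]:
    """Option 2: score completed transactions later and flag accounts for future blocks."""
    return {t["account"] for t in completed if risk_score(t) >= threshold}


suspicious = {"account": "acct-1", "amount": 2500, "country": "FR", "home_country": "US"}
routine = {"account": "acct-2", "amount": 40, "country": "US", "home_country": "US"}

print(authorize_inline(suspicious), authorize_inline(routine))
print(flag_accounts_offline([suspicious, routine]))
```

The inline version stops a bad payment but adds work to every transaction, while the offline version keeps checkout fast and acts only on what comes next. That trade-off is the kind of decision Pouliot says a CIO needs to understand.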
CIOs should, of course, provide the technology infrastructure to run corporate algorithms, and the data they require, says Mark Katz, CIO of the American Society of Composers, Authors and Publishers, which licenses, tracks and distributes royalties to songwriters, composers and music publishers.
Katz meets regularly with ASCAP’s legal department to make sure the results of the algorithms comply with the organization’s charter and pertinent regulations.
“We’re all information brokers at the end of the day,” he says.
CIOs can expect increasing scrutiny of analytics programs. The Federal Trade Commission, in particular, is watching the use of algorithms by banks, retailers and other companies that may inadvertently discriminate against poor people. An algorithm to advise a bank about home loans, for example, might unfairly predict that an applicant will default because certain characteristics about that person place him in a group of consumers where defaults are high.
Or online shoppers might be shown different prices based on criteria such as the devices they use to access an e-commerce site, as has happened with Home Depot, Orbitz and Travelocity. While companies may think of it as personalization, customers may see it as an unfair practice, Luca says.
The Consumer Federation of America recently expressed concern that, in the auto insurance industry, pricing optimization algorithms could violate state insurance regulations that require premiums to be based solely on risk factors, not profit considerations.
Consumers, regulators and judges might start asking exactly what’s in your algorithm, and that’s why algorithms need to be defensible. In a paper published last year in the Boston College Law Review, researchers Kate Crawford and Jason Schultz proposed a system of due process that would give consumers affected by data analytics the legal right to review and contest what algorithms decide.
The Obama administration recently called on civil rights and consumer protection agencies to expand their technical expertise so that they’ll be able to identify “digital redlining” and go after it. In January, President Obama asked Congress to pass the Consumer Privacy Bill of Rights, which would give people more control over what companies can do with their personal data. The president proposed the same idea in 2012, but it hasn’t moved forward.
Meanwhile, unrest among some consumers grows. “Customers don’t like to think they are locked in some type of strategic game with stores,” Pasquale says. CIOs should be wary when an algorithm suddenly produces outliers or patterns that deviate from the norm, he warns. Results that seem to disadvantage one group of people, he says, are also cause for concern. Even if regulators don’t swoop in to audit the algorithms, customers may start to feel uneasy.
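One simple way to watch for results that seem to disadvantage a group is to compare outcome rates across groups. The sketch below is an illustration, not a legal test. The 0.8 threshold borrows the four-fifths rule of thumb from U.S. employment-selection guidelines as an assumed starting point, and the group labels and decisions are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def approval_rates(decisions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of favorable outcomes per group, from (group, approved) pairs."""
    totals: Dict[str, int] = defaultdict(int)
    approved: Dict[str, int] = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def flag_disparities(rates: Dict[str, float], min_ratio: float = 0.8) -> List[str]:
    """Groups whose rate falls below min_ratio times the best-treated group's rate."""
    best = max(rates.values())
    if best == 0:
        return []                      # nobody was approved, so no relative gap to report
    return sorted(g for g, r in rates.items() if r / best < min_ratio)


# Invented data: group A approved 80 times in 100, group B 50 times in 100.
decisions = (
    [("A", True)] * 80 + [("A", False)] * 20
    + [("B", True)] * 50 + [("B", False)] * 50
)
rates = approval_rates(decisions)
print(rates, flag_disparities(rates))
```

A flag from a check like this does not prove discrimination. It marks a result that someone should be able to explain before a regulator or customer asks.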
As Harvard’s Luca puts it, “Almost every type of algorithm someone puts in place will have an ethical dimension to it. CIOs need to have those uncomfortable conversations.”