Presidential Election a Victory for Quants

If there was one lesson for political pundits from last week's presidential election, it was that basic statistical modeling techniques can be used to predict election outcomes with stunning accuracy.

As far back as June, Drew Linzer, an assistant professor of political science at Emory University, predicted that Obama would win reelection and secure at least 52% of the popular vote. Like Nate Silver, Linzer also had Obama winning 332 electoral college votes and Romney taking home 206 votes.

Even as political pundits breathlessly forecast a tight race, Linzer's blog site Votamatic consistently had the president winning the election by a small but comfortable margin.

"I never saw it as being a close race" Linzer said speaking with Computerworld this week. "When I started producing my forecast in late May, the historical model that I was using showed that Obama would get about 52% of the major party vote."

Despite minor fluctuations in support for both candidates in the weeks leading up to the elections, such as immediately after the first debate, the data always showed Obama winning in the end, he said.

Linzer, like Silver, made his forecasts by aggregating state-level poll data with economic indicators and data from previous polls. He started by constructing a baseline forecast for each state using a statistical model developed by Alan Abramowitz, a fellow Emory professor, who used the model to predict the outcome of the 2008 elections.

The model, called Time-For-Change, predicts the incumbent party candidate's national vote share by looking at factors such as the president's approval rating in June, the percentage change in gross domestic product in the first two quarters of the year, and the number of years the incumbent party has held the presidency, Linzer said.
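A model of this kind can be pictured as a linear combination of the three inputs. The Python sketch below is illustrative only: the article does not give the model's functional form or its coefficients, so the linear form, the `Coefficients` class, the `predict_vote_share` function, and every parameter name are assumptions made for this example, and any coefficient values would have to be estimated from past elections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficients:
    """Hypothetical weights; real values would be fitted to historical elections."""
    intercept: float
    approval: float         # weight on the president's June approval rating
    gdp_change: float       # weight on first-half GDP change, in percent
    years_in_office: float  # weight on years the incumbent party has held the presidency


def predict_vote_share(
    coef: Coefficients,
    approval_rating: float,
    gdp_change_pct: float,
    years_in_office: float,
) -> float:
    """Return the incumbent-party vote share implied by an assumed linear model."""
    return (
        coef.intercept
        + coef.approval * approval_rating
        + coef.gdp_change * gdp_change_pct
        + coef.years_in_office * years_in_office
    )
```

The point of the sketch is only that the forecast depends on a handful of measures available months before Election Day; it is not a reproduction of Abramowitz's published model.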

Historical data shows that these measures are especially useful indicators of how a first-term president will fare in the elections, Linzer said. For instance, since 1948, presidents who have been popular in June have been much more likely to get reelected in November, he noted.

As the weeks progressed, Linzer began basing his forecasts increasingly on state-level opinion poll data and less on the historical data that he had used to build his baseline model. "When I started off in May and June, the forecasts were based on long-term fundamental economic and political variables," he said, because there was little poll data available at the time.

As more poll data became available, it was added to the mix. "The basic idea is that on Election Day or the weeks leading up to Election Day, the polls are the best indicator" of an outcome, he said.
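The gradual shift from a fundamentals-based baseline to the polls can be sketched as a weighted blend whose poll weight grows as Election Day approaches. The article does not say how Linzer's model performs this weighting, so the linear schedule, the function names, and the `horizon_days` parameter below are assumptions chosen for illustration.

```python
def poll_weight(days_until_election: float, horizon_days: float = 150.0) -> float:
    """Weight given to polls: 0.0 at the horizon or earlier, 1.0 on Election Day.

    The linear ramp is an assumed schedule, not the one used by any real forecaster.
    """
    if horizon_days <= 0:
        raise ValueError("horizon_days must be positive")
    remaining = min(max(days_until_election, 0.0), horizon_days)
    return 1.0 - remaining / horizon_days


def blended_forecast(
    baseline_share: float,
    poll_share: float,
    days_until_election: float,
    horizon_days: float = 150.0,
) -> float:
    """Blend a baseline forecast with a poll average for one state."""
    w = poll_weight(days_until_election, horizon_days)
    return w * poll_share + (1.0 - w) * baseline_share
```

With this schedule, a forecast made at the start of the horizon equals the baseline, and a forecast made on Election Day equals the poll average.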

One of the mistakes that many pundits were making was to look at national-level poll data to predict the outcome, Linzer said. National polls often are unable to detect local trends and patterns with the same level of granularity that a state poll does. "In a national poll, I would only get a few respondents from an Ohio or a New Hampshire where the elections are being decided," Linzer said.
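Linzer's point about small state subsamples can be made concrete with the standard margin-of-error formula for a proportion. The sketch below assumes simple random sampling, a 95% confidence level, and the worst case of an evenly split electorate; the function names and the population share used in the note that follows are hypothetical.

```python
import math


def expected_respondents(national_sample_size: int, state_population_share: float) -> int:
    """Approximate number of respondents a national poll draws from one state."""
    return round(national_sample_size * state_population_share)


def margin_of_error(sample_size: int, z: float = 1.96) -> float:
    """Worst-case margin of error (as a fraction) for a proportion near 50%."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(0.25 / sample_size)
```

Under these assumptions, a state that makes up a hypothetical 4% of a 1,000-person national sample contributes about 40 respondents. The margin of error on those 40 is roughly 15 percentage points, compared with roughly 3 points for the full sample, which is far too coarse to call a close state.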

The Signal

Yahoo's The Signal blog accurately predicted the outcome in all states except Florida by applying a similar approach.

David Rothschild, chief economist at Microsoft and developer of the model used by The Signal, called the blogs' forecasts "a triumph of science over punditry."

As far back as February this year, The Signal came out with a baseline forecast predicting an Obama win even before the Republican nominee was known. Rothschild, too, looked at historical data, state-level economic indicators and factors like the president's approval rating and the advantages of incumbency to build the first baseline forecast.

Looking at economic data at the state level offered far better insight than national economic conditions, Rothschild said. For instance, though both Ohio and Virginia were thought to be leaning towards the Republican challenger in February, The Signal predicted Obama would win, because local economic conditions were stronger than average in both states, he said.

The forecasts were also based on long-term economic trends rather than economic snapshots, he said. For instance, instead of looking at the absolute unemployment numbers or inflation levels for any given month, The Signal's forecasting model assigned more weight to broader employment trends to make its predictions.

As the campaigns progressed and more poll data became available, The Signal's model relied increasingly on that data, rather than on historical and economic indicators, as the basis for its predictions.

"We saw a diminishing return for economic movement after the second quarter," Rothschild said. "Unless something very drastic happens in the third quarter the narrative usually has already been written," at least from an economic standpoint, he said. So factors like the late-breaking jobs numbers, had less impact on the outcome that many pundits might have prognosticated, he said.

"For most of the election cycle, we had Obama at around 303 [electoral college votes]," Rothschild said. "At no point did we not think the president was leaning certain or extremely certain to win," reelection, he said.

FHQ

In the end, it was the accuracy of the state-level polls that made all the difference, said Josh Putnam, a visiting professor of political science at Davidson College and author of FHQ, another blog that early on predicted a 332-206 Obama electoral college vote victory.

"If we were all making sausage, then FHQ and the others were merely turning the crank on our various sausage make apparati," Putnam wrote on his blog. "The filling -- the polls -- was what really nailed the election projection on the state level."

If the polls had been wrong, so would the forecasts, Putnam said in an interview with Computerworld.

Like the others, Putnam focused on state-level polls to draw his conclusions. But instead of using a statistical model, he simply aggregated state-level poll data to arrive at his forecasts.

"I took a rather pedestrian approach," compared to the others, he said. "It was not very complicated. My forecasts were based simply on a weighted average of poll data," he said. "The value of my methodology is to set up a nice little baseline that is fundamentally based on the polls," he said. "The polls were very good this time. That helped me," he said.

Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld. Follow Jaikumar on Twitter at @jaivijayan or subscribe to Jaikumar's RSS feed. His e-mail address is jvijayan@computerworld.com.

This story, "Presidential Election a Victory for Quants," was originally published by Computerworld.
