Political candidates aren't the only ones hoping to sway voters this election season; plenty of other groups are engaging in campaigns -- and those efforts are increasingly driven by big data, even at smaller organizations with limited resources.
The Sierra Club, for instance, doesn't have the resources of a national presidential campaign. Unlike the Obama for America campaign, it can't hire a small army of predictive analysts and data scientists to model every aspect of its election-year strategy. But the nonprofit is using data mining to identify swing voters who are most likely to be motivated by its environmental message -- and who are most likely to be moved to vote for the candidates the club has endorsed in the 2012 election.
What politicians know about you
By combining voter records with their own donor lists, consumer databases and online information, political campaigns and advocacy groups can assemble highly detailed databases with hundreds of fields, or bits of information, that describe each voter's party affiliation, likes and dislikes, socio-economic background and more. This data comes from sources such as these:
State voter registration rolls. These lists include voters' names and addresses, as well as registration status, party affiliation and voting history.
Consumer databases. Available for purchase from third-party marketing firms, these databases contain socio-economic and demographic data, including details such as information about people's hobbies, interests, lifestyles and magazine subscriptions.
Campaign donor and volunteer databases. These repositories include people's names, street addresses, email addresses, contribution histories (with the dollar amounts of their donations) and volunteer activities. They might even contain information about civic actions people have taken, such as signing petitions.
Miscellaneous online sources, including websites and social networks like Facebook, Groupon, Twitter and LinkedIn. Records of people's activities on social media sites represent a rich source of psychographic data (interests, hobbies, lifestyles). That information is integrated with data from traditional sources through a process called cookie matching, by scraping sites for information or by encouraging voters to self-identify, which they do when they, for example, like a campaign or advocacy group's page or click through to the campaign's website and respond to a request for a donation.
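As a rough illustration of how records from sources like these might be combined into a single profile per voter, consider the minimal sketch below. Everything in it is an assumption made for the example: the sample records, the field names, the shared `voter_id` key and the `build_profiles` helper are all invented. Real matching generally has to reconcile names and addresses across lists rather than rely on a common ID.

```python
from collections import defaultdict


def build_profiles(*sources):
    """Merge records from several sources into one dict per voter_id.

    Later sources only add fields; they never overwrite a value that an
    earlier source already supplied.
    """
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            profile = profiles[record["voter_id"]]
            for field, value in record.items():
                profile.setdefault(field, value)
    return dict(profiles)


# Hypothetical sample data, invented for this sketch.
voter_roll = [
    {"voter_id": 1, "name": "A. Voter", "party": "unaffiliated", "voted_2008": True},
    {"voter_id": 2, "name": "B. Voter", "party": "Democratic", "voted_2008": False},
]
consumer_data = [
    {"voter_id": 1, "hobbies": ["hiking", "birding"]},
]
donor_list = [
    {"voter_id": 2, "total_donated": 50.0, "signed_petition": True},
]

profiles = build_profiles(voter_roll, consumer_data, donor_list)
print(profiles[1])
```

Listing the voter roll first means its values win whenever two sources disagree on a field, which is one simple way to decide which source to trust.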
For the Sierra Club and organizations like it, the objective isn't to win at any cost, but to win cost-effectively.
"We target both people who might sit the election out unless sufficiently motivated and folks who may be undecided with a message that will be effective," says Sierra Club political director Cathy Duvall. In this way, the organization doesn't waste resources reaching out to voters who are already on board or those who are unlikely to be persuaded. "We have a more clean shot at the voters we want, and in most cases the return on investment is immediate," she says.
It's a two-step process, Duvall explains. First, analysts apply data mining techniques to a massive database that provides very detailed profiles of the club's own members as well as "look-alikes" who fit the profile of swing voters. Then they develop models that predict which voter profiles will be most likely to respond positively to a campaign message and which type of issue will be most likely to move them to action.
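The article doesn't describe the club's actual models, but the general idea of finding "look-alikes" can be sketched with a deliberately simple stand-in: scoring each non-member by how closely his or her traits overlap with those of known members, using Jaccard similarity. The trait names, sample voters and the `lookalike_score` helper are hypothetical, and production models would draw on far more fields and more capable statistical techniques.

```python
def jaccard(a, b):
    """Share of traits two sets have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def lookalike_score(candidate_traits, member_trait_sets):
    """Best similarity between a candidate and any known member."""
    return max(
        (jaccard(candidate_traits, m) for m in member_trait_sets),
        default=0.0,
    )


# Hypothetical member and non-member profiles, invented for this sketch.
members = [
    {"hikes", "recycles", "votes_sometimes"},
    {"gardens", "recycles", "donates_to_parks"},
]
candidates = {
    "voter_a": {"hikes", "recycles", "votes_sometimes", "owns_boat"},
    "voter_b": {"watches_golf", "owns_boat"},
}

# Rank non-members from most to least member-like.
ranked = sorted(
    candidates,
    key=lambda v: lookalike_score(candidates[v], members),
    reverse=True,
)
print(ranked)
```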
"In some instances, we can take this research a level deeper through real-world experimentation," Duvall says. To accomplish this, Sierra staffers try out a range of specific messages on test groups to determine which will be the most effective before launching the campaign to the target audience. "We can see which messages are moving the voters. Before we could do cross-tabs and see the broad categories of people who might be moving, but with data mining we can go much deeper."
The 2012 election is shaping up to be the year of the data-driven, big data campaign. Political operatives in virtually every campaign, and across the political spectrum, are applying data mining techniques to mountains of new information from online sources that offer unparalleled insights into voter interests and habits.
For example, armed with more data, analysts can predict more accurately how individuals are likely to vote and whether they are Republicans or Democrats.
As they combine online data -- including social media posts -- with traditional data sources such as consumer databases, analysts can target groups of voters that fit very detailed profiles and choose the messages that will be the most likely to achieve the desired response. This sort of analytics work, known as microtargeting, was already under way during the last presidential election cycle. But since then, the amount of information available about individual voters has exploded. Campaigns have become more sophisticated in its use, and the tent has expanded, with smaller advocacy groups and campaigns coming on board.
"That ability for niche groups, such as the Sierra Club, to communicate only with people likely to support their cause didn't exist four years ago," says Patrick Ruffini, president of Engage DC, a firm that handles online advertising and analytics work for the Republican National Committee and individual Republican candidates.
In search of the like-minded
Many of the voters the Sierra Club wants to reach aren't in its own member database, so Duvall works with Catalist, a consortium of progressive organizations that maintains a 500-terabyte database of information describing both registered and unregistered U.S. voters.
"Our database is about civic behavior and transactions, what issues you care about, what causes you support, whether you tend to vote or not, and so on," says Catalist CEO Laura Quinn.
Catalist matches up the Sierra Club's member database with its own data and provides access to the full database, which combines state voter registration lists with commercial consumer data that includes demographic (race, gender, age, income) and psychographic (interests, hobbies, lifestyles) information on individuals and households. Catalist buys commercial consumer data from traditional data aggregators and reporting agencies such as Acxiom and Equifax. Voter lists come from the states.
For those states that don't release voter registration data, Catalist has developed models that predict, at the household level, who is likely to be Republican or Democrat and how they're likely to vote -- something it couldn't do in 2008.
Yair Ghitza, senior scientist at Catalist, explains further: "Our clients determine the likelihood that someone is going to vote, care about certain issues or has leanings on certain issues, their partisanship and ideologies, and the actions they're most likely to engage in when they take civic action," he says.
Aristotle Inc. offers a similar service to trade associations and campaigns, including both presidential campaigns, according to CEO John Aristotle Phillips. Its database of more than 700 data fields, which describe the traits of more than 85 million registered voters, is used for both fundraising and get-out-the-vote initiatives.
"What we're seeing in 2012 is much more effective use of real-time access" to databases about voters, says John Aristotle Phillips, CEO of Aristotle Inc.
Clients use the database to create models that find people who are similarly minded or likely to contribute. Aristotle then helps them deliver a targeted message to individuals who match the criteria through various channels, including TV, direct mail, email and social media. The more sophisticated campaigns were doing this in the last election cycle, Phillips says.
"What we're seeing in 2012 is much more effective use of real-time access to these databases. You know as contributions are coming in who else to email of a similar demographic," he says.
"Digital is no longer a separate division in campaigns," says Patrick Hynes, president of Hynes Communications, a consultancy specializing in online and new media communications strategy that currently serves as an adviser to the Romney campaign. "It's cross-portfolio -- everyone has to work in a digital environment."
But the next election cycle, Hynes says, will be all about mobile. "Mobile will be first in the minds of everyone" -- for everything from polling and press releases to sentiment measurement and fundraising.
Mobile is already changing the game, particularly in the area of door-to-door campaigning, where canvassers are increasingly taking advantage of mobile apps and the Square mobile payment service.
Square offers a small card reader that attaches to a smartphone, enabling the user to accept payments anywhere, at any time. Canvassers who use the device can take campaign donations right on voters' doorsteps.
As campaign volunteers go door to door, they might rely on mobile apps for customized messages about specific households. They could look at profiles that not only indicate whether an individual is a Republican or a Democrat, but also offer guidance about how much of a donation to ask for based on the person's past history of campaign donations. In addition, canvassers can use apps to capture details of interactions with voters and upload that information to the campaign database, thereby providing continuous, real-time feedback.
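To make the donation guidance concrete, here is a minimal sketch of how an app might turn a person's giving history into a suggested ask. The rule itself is purely an assumption for illustration (take the median past gift, step it up by 25% and round up to the next $5), as are the `suggested_ask` name and its default values; the article does not say how any campaign's app actually calculates this.

```python
import math
from statistics import median


def suggested_ask(past_donations, default=25, step_up=1.25, round_to=5):
    """Suggest a donation amount from a list of past gifts (in dollars).

    Hypothetical rule: median past gift, raised by `step_up`, rounded up
    to the next multiple of `round_to`. With no history, use `default`.
    """
    if not past_donations:
        return default
    target = median(past_donations) * step_up
    return round_to * math.ceil(target / round_to)


# Hypothetical giving histories, invented for this sketch.
print(suggested_ask([20, 40, 100]))
print(suggested_ask([]))
```

Using the median rather than the average keeps one unusually large past gift from inflating the ask.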
"The Obama campaign has taken it up a notch," says Engage's Ruffini. "They're recording what people say when they knock on doors. They make thousands of phone calls every night. They do text analysis, and then make decisions on TV and ad spending." (Obama for America did not return calls asking for comment.)
On the Republican side, mobile apps improve the efficiency of door-to-door campaigning, because they can tell canvassers exactly which doors to knock on in rural areas to reach the party faithful, independents and swing voters, says Hynes. Because Democrats tend to live in urban areas, Democratic campaign workers can be effective by canvassing entire neighborhoods. However, "Republicans live in the suburbs and exurbs, so it's been harder to go door to door," says Hynes, adding that mobile is helping to level the playing field.
Coming up, in part 2 of this story: Merging online and offline data, privacy issues and more.
This story, "Campaign 2012: Mining for Voters," was originally published by Computerworld.