“Just having a pile of data is not that useful,” says Phil Kim, CTO of Bundle. “You have to ask the right questions and process the data in the right way where you can do meaningful analysis.”
While that’s good advice for any company, at Bundle, having well-managed data is critical to operating its new recommender service, called Everybody’s Money. Launched in May 2011 and covering New York City and San Francisco, Everybody’s Money aims to offer customers a “more holistic” recommendation engine for dining and retail establishments, says Jaidev Shergill, CEO and founder of Bundle, an online personal finance service and a spinoff of banking giant Citi. What sets Bundle’s service apart from Yelp, Zagat and similar online guides is how the recommendations are generated: Rather than soliciting opinions from patrons, Bundle uses the anonymized transactions of 25 million Citi credit card holders, cross-indexed with Census Bureau data and other third-party demographic information.
Before the source data can be used, Kim explains, it is scrubbed of any identifying information, tagged, and aggregated to form a picture of each retailer Bundle profiles.
Value for Customers
Credit card transactions offer a better glimpse into the value of an eatery because they track how many times patrons return and how much they spend each visit, notes Shergill. An establishment with lots of repeat traffic is probably a good bet. Moreover, comparing restaurants’ average bills can help a user identify places that are better values. Two restaurants near one another may each have a loyal following, but the one with the lower average bill may be easier on the wallet.
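To make the two signals concrete, here is a minimal Python sketch of how a repeat-visit rate and an average bill could be computed per merchant. It is an illustration only, not Bundle's actual method: the record layout, the sample data, and the `merchant_stats` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical anonymized records: (card_token, merchant, amount_usd).
transactions = [
    ("c1", "Luigi's", 42.00),
    ("c1", "Luigi's", 38.00),
    ("c2", "Luigi's", 40.00),
    ("c3", "Chez Nous", 95.00),
    ("c3", "Chez Nous", 105.00),
    ("c4", "Chez Nous", 100.00),
]

def merchant_stats(records):
    """Return {merchant: (repeat_rate, average_bill)}.

    repeat_rate is the share of distinct cards seen more than once
    at the merchant; average_bill is the mean transaction amount.
    """
    visits = defaultdict(lambda: defaultdict(int))  # merchant -> card -> visits
    totals = defaultdict(float)                     # merchant -> total spend
    counts = defaultdict(int)                       # merchant -> transactions
    for card, merchant, amount in records:
        visits[merchant][card] += 1
        totals[merchant] += amount
        counts[merchant] += 1

    stats = {}
    for merchant, by_card in visits.items():
        repeaters = sum(1 for n in by_card.values() if n > 1)
        repeat_rate = repeaters / len(by_card)
        average_bill = totals[merchant] / counts[merchant]
        stats[merchant] = (repeat_rate, average_bill)
    return stats

stats = merchant_stats(transactions)
```

In this made-up sample both restaurants have the same repeat rate, so the lower average bill is what would mark one of them as the better value.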
The software assembled to make these recommendations consists mostly of off-the-shelf tools, many of them open-source. For example, data is stored in MySQL and Microsoft SQL Server databases. The analysis is done using the Apache Hadoop open-source data-processing framework.
The basic data, once it’s stripped of any personal information, is not that different from what you see on your credit card statement: the date of a transaction, the payee and the amount. From each record, Bundle’s algorithms attempt to derive additional information, such as the type of business (for example, a restaurant or a shoe store) and its location.
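The article does not describe how Bundle's algorithms work, but the general idea of tagging a payee line can be sketched with simple keyword rules. Everything in the Python example below is an assumption made for illustration: the keyword lists, the descriptor layout (merchant name followed by city and a two-letter state), and the `tag_payee` function.

```python
import re

# Hypothetical keyword rules; a production system would need far richer signals.
CATEGORY_KEYWORDS = {
    "restaurant": ("PIZZA", "GRILL", "CAFE", "BISTRO"),
    "shoe store": ("SHOES", "FOOTWEAR"),
}

# Assumed descriptor layout: "<MERCHANT NAME> <CITY> <2-letter STATE>".
STATE_SUFFIX = re.compile(r"\s+([A-Z]{2})$")

def tag_payee(payee):
    """Guess a business category and a state from a raw payee string."""
    text = payee.upper().strip()
    tokens = text.split()

    # Treat a trailing two-letter token as the state code, if present.
    match = STATE_SUFFIX.search(text)
    state = match.group(1) if match else None

    # Pick the first category whose keywords appear as a whole token.
    category = "unknown"
    for name, keywords in CATEGORY_KEYWORDS.items():
        if any(word in tokens for word in keywords):
            category = name
            break
    return {"category": category, "state": state}

tagged = tag_payee("JOES PIZZA NEW YORK NY")
```

Rules like these are brittle, since a merchant name that ends in a two-letter word would be misread as a state, which is one reason the tagging step matters so much to the quality of the final recommendations.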
Finally, the data is aggregated and the company forms a picture of each retail outlet based on the buying habits of its credit-card-holding customers. Bundle can see, for example, whether a restaurant is a neighborhood favorite—people from a particular ZIP code frequent it—or if it attracts visitors from out of town. And by combining that information with external demographic data, it can report on the characteristics of patrons, such as their average age, their affluence, and even how often they eat out.
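One simple way to express the neighborhood-favorite idea is the share of a merchant's transactions that come from cardholders living in the merchant's own ZIP code. The Python sketch below assumes that definition; the function names and the 50 percent threshold are illustrative choices, not figures from Bundle.

```python
from collections import Counter

def local_share(customer_zips, merchant_zip):
    """Fraction of transactions whose cardholder home ZIP matches the merchant's ZIP."""
    if not customer_zips:
        return 0.0
    counts = Counter(customer_zips)
    return counts[merchant_zip] / len(customer_zips)

def label(share, threshold=0.5):
    """Classify a merchant by its local share (the threshold is an assumed cutoff)."""
    return "neighborhood favorite" if share >= threshold else "draws visitors"

# Hypothetical home ZIP codes, one per transaction at a merchant in 10003.
home_zips = ["10003", "10003", "10003", "94110"]
share = local_share(home_zips, "10003")
verdict = label(share)
```

The same per-ZIP counts could then be joined with outside demographic data to describe a merchant's typical patrons.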
Data that has been cleansed and tagged can easily be reused, and Kim and his team are looking for more ways to put it to work. One idea: combining transaction information with the personal financial data users track with Bundle’s My Money tool, to provide recommendations and money-saving tips based on users’ behavior and preferences.