The Tech Behind 236 EHarmony Members Getting Hitched Daily

By Eric Lai
Wed, September 16, 2009

Computerworld — While eHarmony's goal is to get its 20 million members married or into long-term relationships, the online matchmaker is a downright commitment-phobe in its use of technology.

For the business intelligence infrastructure that powers its matchmaking algorithms and maximizes the effectiveness of its numerous TV ads, the firm relies on four database and data warehousing products.

They include Oracle Database, the open-source MySQL database, another open-source data-crunching app, Hadoop, and data warehousing appliances from Netezza Inc.

For some IT managers, managing four such disparate products wouldn't be worth the trouble. Not so for Joseph Essas, vice president of engineering and operations for eHarmony.

"We always use multiple vendors for different things," Essas told the audience during his speech Wednesday at Computerworld's Business Intelligence Perspectives conference in Chicago.

Essas says he likes the "leverage from playing multiple people against each other." He fears that while settling down exclusively with one vendor might initially be a "bargain," it would eventually lead eHarmony to financially "bleed to death in years 2 to 5."

Essas' philosophy is interesting because it runs so counter to the site's goals, as a self-declared maker of long-term relationships.

eHarmony Inc. says that 236 of its 20 million members get married every day. That's more than 2% of American marriages per year, according to statistics based on online surveys conducted by a third party, Harris Corp., that were commissioned by eHarmony.

Marriage is only one of "hundreds of metrics" that eHarmony "deeply cares about," said Essas.

Tracking and crunching all of these metrics is key, as eHarmony must produce good matches for its members as soon as they fill out their profiles at sign-up, lest they lose them to rival dating sites.

"Their attention span with us is very short," Essas said. "So we need to get it right on the first try, if you will."

Assigning matches is a complex mathematical problem called "graph partitioning," said Essas.
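For readers unfamiliar with the term, graph partitioning means splitting the nodes of a weighted graph into groups so that little edge weight crosses between groups. The toy Python sketch below is not eHarmony's algorithm, which the article does not describe. It assumes six made-up members and hypothetical affinity scores, and it simply tries every balanced split, something that only works for a handful of nodes.

```python
from itertools import combinations

def best_bisection(nodes, weights):
    """Return (cut_weight, group_a, group_b) for the balanced split with the smallest cut.

    weights maps frozenset({u, v}) to an edge weight; missing pairs count as 0.
    """
    nodes = sorted(nodes)
    half = len(nodes) // 2
    first = nodes[0]
    best = None
    # Keep the first node in group A so each split is examined only once.
    for rest in combinations(nodes[1:], half - 1):
        group_a = {first, *rest}
        group_b = set(nodes) - group_a
        # The cut is the total weight of edges that cross between the groups.
        cut = sum(weights.get(frozenset((u, v)), 0)
                  for u in group_a for v in group_b)
        if best is None or cut < best[0]:
            best = (cut, group_a, group_b)
    return best

# Hypothetical affinity scores between six made-up members.
weights = {
    frozenset(("ann", "bob")): 9,
    frozenset(("bob", "cat")): 8,
    frozenset(("ann", "cat")): 7,
    frozenset(("dan", "eve")): 9,
    frozenset(("eve", "fay")): 8,
    frozenset(("dan", "fay")): 7,
    frozenset(("cat", "dan")): 1,
}

cut, group_a, group_b = best_bisection(
    ["ann", "bob", "cat", "dan", "eve", "fay"], weights
)
print(cut, sorted(group_a), sorted(group_b))
```

The two tightly connected trios end up in separate groups, with only the weak "cat"/"dan" link crossing the cut. Exhaustive search like this grows far too quickly for millions of members, which is one reason the problem is considered hard at scale.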

eHarmony uses Oracle to do much of the initial matching. But for its hardcore data-processing, eHarmony relies on a 50-node Hadoop cluster. Hadoop is speedy, says Essas. "What used to take hours now with Hadoop takes just 30 minutes," he said.

That's important, because eHarmony rescores its relationship matches whenever new members sign up, or even when existing members update their profiles.

Hadoop also forces eHarmony to keep its data in key-value store form, rather than in a structured SQL format.

"It's really hard to build reusable data structures, especially at scale, in SQL," he said. Using Hadoop also makes it easier to figure out the cause of slow queries compared to using a SQL database, he said. And it forces eHarmony's developers to be more disciplined about what data they store permanently, preventing the database from getting "too bloated."
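To give a rough sense of what key-value processing looks like, here is a small Python sketch that imitates the map and reduce steps of a Hadoop-style job when a new member signs up. It is plain Python, not Hadoop's API, and not eHarmony's code. The member IDs, the `interests` field and the `score` function (a count of shared interests) are all hypothetical stand-ins.

```python
from collections import defaultdict

def score(a, b):
    """Hypothetical compatibility score: the number of shared interests."""
    return len(set(a["interests"]) & set(b["interests"]))

def map_phase(new_id, profiles):
    """Emit (member_id, (candidate_id, score)) pairs for every pairing
    that involves the newly added member."""
    new = profiles[new_id]
    for other_id, other in profiles.items():
        if other_id == new_id:
            continue
        s = score(new, other)
        yield new_id, (other_id, s)
        yield other_id, (new_id, s)

def reduce_phase(pairs, top_n=2):
    """Group emitted values by key and keep the top_n candidates per member."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Sort by score (highest first), breaking ties by candidate ID.
    return {
        key: sorted(values, key=lambda c: (-c[1], c[0]))[:top_n]
        for key, values in grouped.items()
    }

# Profiles kept as key-value records: member ID -> profile data.
profiles = {
    "m1": {"interests": ["hiking", "jazz", "cooking"]},
    "m2": {"interests": ["jazz", "cooking", "chess"]},
    "m3": {"interests": ["hiking"]},
}

# A new member signs up, so only pairings involving "m4" are rescored.
profiles["m4"] = {"interests": ["jazz", "chess", "hiking"]}
matches = reduce_phase(map_phase("m4", profiles))
print(matches["m4"])
```

In a real cluster the map and reduce steps would run in parallel across many nodes; here they run in a single process purely to show the shape of the data.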
