In-Memory Technology Speeds Up Data Analytics
Long the purview of financial firms looking for an edge as they make lightning-fast transactions, in-memory technology is starting to catch the attention of many firms that conduct real-time analysis.
Wed, June 26, 2013
CIO — There's fast, and then there's mind-numbingly fast.
Just ask AdJuggler, an Alexandria, Va.-based company that runs a Software-as-a-Service ad-serving platform. The company's ad serving business was always fast-paced, but the advent of real-time bidding has taken speed to a new level.
In real-time bidding, a publisher sends an ad impression to an online exchange that puts out a request for bids. When a user arrives on a particular Web page, advertisers tender bids, and the highest bidder's ad is placed on the page. The digital ad sale happens quickly; according to Ben Lindquist, vice president of technology at AdJuggler, a buyer in a real-time bidding scenario has a 100 millisecond window to bid on a given impression.
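The time pressure Lindquist describes can be sketched in code. The snippet below is purely illustrative (not AdJuggler's actual system); the function names and the impression/profile shapes are invented for the example. It shows a bid handler that enforces the 100 millisecond budget: if the data lookup eats the window, the bidder simply returns no bid rather than respond too late.

```python
import time

BID_WINDOW_MS = 100  # the 100 ms window described in the article


def handle_bid_request(impression, lookup_profile, compute_bid):
    """Return a bid price, or None if the deadline would be missed.

    lookup_profile and compute_bid are placeholders for whatever the
    bidder uses to fetch audience data and price an impression.
    """
    deadline = time.monotonic() + BID_WINDOW_MS / 1000.0

    # The data lookup is the dangerous part: a disk seek here can
    # consume most of the window. An in-memory lookup rarely does.
    profile = lookup_profile(impression["user_id"])

    if time.monotonic() >= deadline:
        return None  # too late; the exchange has already moved on
    return compute_bid(impression, profile)
```

A fast in-memory lookup (a dict `get`) leaves ample time to bid, while a lookup that blocks for longer than the window causes the handler to drop out of the auction entirely.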
It's that kind of speed requirement that led AdJuggler to purchase an in-memory data management product, Terracotta's BigMemory. The in-memory technology is set to debut in a limited use case later this month as part of AdJuggler's next-generation ad-serving platform.
Such deployments move the database from its traditional home in disk storage, placing it instead in memory. This approach boosts database query response times, since the trip from memory to processing core is much faster than searching for data housed on disk.
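The difference between the two approaches can be made concrete with a toy sketch (this is not how Terracotta or any real database is implemented; the class and file layout are invented for illustration). The naive disk-backed store pays for I/O on every lookup, while the in-memory store pays once at load time and then answers every query with a hash lookup in RAM.

```python
import json


class DiskStore:
    """Naive disk-backed store: every get() re-reads the file,
    standing in for the 'disk seek' the article describes."""

    def __init__(self, path):
        self.path = path

    def get(self, key):
        with open(self.path) as f:
            return json.load(f).get(key)


class InMemoryStore:
    """Same data, loaded once into RAM: get() is a dict lookup, no I/O."""

    def __init__(self, path):
        with open(path) as f:
            self.data = json.load(f)

    def get(self, key):
        return self.data.get(key)
```

Both stores return identical answers; the in-memory version simply removes disk I/O from the query path, which is the essence of the trade Allen's rule of thumb quantifies.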
Mike Allen, vice president of product management at Terracotta, a wholly owned subsidiary of Software AG, says that, as a rule of thumb, memory is 1,000 times faster than disk. For AdJuggler, with its ultra-narrow bidding window and high transaction volume, that speed difference tipped the balance in favor of in-memory.
"Bidders can't afford to spend a significant portion of that window doing disk seeks," Lindquist says. "That is what it has come down to."
Moving to In-Memory From Disk Means Less Database Tuning
AdJuggler's current platform, which matches ads to places on Web pages at a clip of 20,000 transactions per second, includes a MySQL database. The database houses configuration data on customers' campaigns—essentially, ad placements on various websites. Lindquist says all that configuration data will move from the disk-based MySQL data store to Terracotta's in-memory technology. AdJuggler will also add multiple terabytes of anonymized audience data.
"We will end up with a record in there for every user who goes to a piece of content that can be served an ad through our system," Lindquist says, adding that the user data will amount to hundreds of millions of records.
That data store will grow further, since AdJuggler customers will be permitted to place their own proprietary audience data into the Terracotta data management system. As for throughput, the new platform will be able to scale to support at least 1 million transactions per second, Lindquist notes.
The in-memory shift expands the possibilities for a database involved in real-time decision making, Lindquist says. Previously, getting a database to perform at the now-required level would call for a significant amount of tuning—configuring memory and carving out a data cache in RAM to improve performance.
A cache hit is quicker than going back to disk for data, but a cache typically represents a small portion of the data stored in a database. Lindquist notes that MySQL performance depends on having the right piece of data in memory at the right time.
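The dynamic Lindquist describes can be illustrated with a simple least-recently-used cache in front of a slow backing store (a generic sketch, not MySQL's actual buffer management). Because only a fixed number of entries fit, performance depends entirely on whether the keys being requested happen to be resident—exactly the tuning gamble that an all-in-memory design eliminates.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache over a slow backing store.

    backing_get stands in for the slow path (a disk read, in the
    database case). Only `capacity` entries fit, so the hit rate
    depends on the access pattern matching what is cached.
    """

    def __init__(self, capacity, backing_get):
        self.capacity = capacity
        self.backing_get = backing_get
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing_get(key)  # slow path: 'go back to disk'
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return value
```

With a working set larger than the cache, entries keep getting evicted and re-fetched; putting all of the data in memory, as AdJuggler chose to, makes every lookup a hit by construction.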
Why not put all the crucial data there? "We decided it's all got to be in memory," Lindquist explains, "so you don't have to worry about the tremendous amounts of database tuning you typically would have to do." AdJuggler will run a Terracotta cluster, using the distributed version of the company's BigMemory data management software.