MongoDB Refines Load Balancing
The MongoDB data store can now randomize placement of data on different shards to smooth load balancing
Tue, March 19, 2013
IDG News Service (New York Bureau) — Following the tradition set by recent versions, the new release of the MongoDB NoSQL data store comes with a batch of new features designed to appeal to the enterprise market, including a new built-in search engine, more support for geospatial data and the ability to balance workloads across multiple servers more effectively.
"We're moving more quickly," said Kelly Stirman, 10gen director of product marketing, referring to how MongoDB's growing user base is giving the company more resources. MongoDB 2.4, available Tuesday, was released only six months after the last major version, 2.2. "We've substantially increased the size of the engineering corporation and the company has grown dramatically over the past year," Stirman said.
The company has also updated the commercially supported version of this open source data store, called MongoDB Enterprise.
Since 10gen began work on MongoDB in 2007, the data store has been downloaded more than 4 million times. The document data store was designed to ingest and read large amounts of data very quickly, and has proved popular in the fields of analysis, content management, mobile and social infrastructure and user data management. 10gen supports more than 600 commercial customers with the enterprise version of the data store, including Craigslist, Disney, Electronic Arts, eBay, Foursquare, Intuit, LexisNexis, MTV, Salesforce.com and Telefonica.
One of the chief new features of MongoDB 2.4 is hash-based sharding. Sharding takes place when different parts of a data table are spread across multiple servers. Hash-based sharding places each new entry according to a hash of its shard key, which scatters new entries in an effectively random fashion across all the available servers. As a result, data is distributed more evenly, minimizing hotspots that occur when too much frequently consulted data -- such as recently captured data -- gets placed on a single server.
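The difference between the two placement strategies can be illustrated with a small simulation. The sketch below is a toy model, not MongoDB's implementation: the four-shard cluster, the function names and the use of MD5 as the hash are all assumptions made for illustration. It places the 1,000 most recently inserted entries, whose keys increase steadily, under each scheme and counts how many land on each shard.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4    # hypothetical cluster size
MAX_KEY = 10_000  # hypothetical upper bound of the key space


def range_shard(key: int) -> int:
    """Range-based placement: contiguous key ranges share a shard."""
    return min(key * NUM_SHARDS // MAX_KEY, NUM_SHARDS - 1)


def hashed_shard(key: int) -> int:
    """Hash-based placement: the shard depends on a hash of the key.

    MD5 is an illustrative choice here, not a statement about
    which hash function MongoDB uses internally.
    """
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def placement(keys, shard_fn) -> Counter:
    """Count how many of the given keys land on each shard."""
    return Counter(shard_fn(k) for k in keys)


# The most recently captured entries: steadily increasing keys.
recent_keys = range(9_000, 10_000)

range_counts = placement(recent_keys, range_shard)
hash_counts = placement(recent_keys, hashed_shard)

print("range-based:", dict(range_counts))
print("hash-based: ", dict(hash_counts))
```

Under the range-based scheme, all of the recent entries fall in the same key range and so end up on one shard, which is the hotspot described above. Under the hash-based scheme, the same entries are spread across all four shards.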
"You get a nice distribution of all the documents across all the shards for reads and writes," Stirman said, adding that range-based sharding -- which was the previous default sharding algorithm -- will continue to be available.
The new built-in search engine, still in beta mode, may eliminate the need to maintain an external search engine, such as Apache Lucene/Solr. It offers simple text search, so it does not have all the capabilities of a stand-alone search engine, such as natural language processing. But because it is built into MongoDB itself, it is a lot easier to configure and maintain, Stirman said.
"It will be good enough for a lot of applications, and the community will be excited because they won't have to worry about integrating another technology, especially if they are deploying across multiple data centers. With MongoDB, that is pretty easy," Stirman said.
MongoDB 2.4 contains a number of new techniques that may allow developers to make better use of data. One new feature, capped arrays, should prove useful in interactive Web 2.0 environments. A capped array is an array with a predefined limit on the number of items it can hold. A website could use one to display the 20 most popular user comments, for instance.
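The behavior of a capped array can be sketched in a few lines of plain Python. This is only a model of the idea and does not call MongoDB; the function name `push_capped` and the comment fields `text` and `votes` are hypothetical. In MongoDB itself, the equivalent effect is, to the best of our understanding, obtained with the `$push` update operator combined with its `$each`, `$sort` and `$slice` modifiers.

```python
from typing import Dict, List


def push_capped(array: List[Dict], new_items: List[Dict],
                cap: int, sort_key: str) -> List[Dict]:
    """Append new_items, sort ascending by sort_key and keep
    only the last `cap` entries (the highest-ranked ones)."""
    combined = array + new_items
    combined.sort(key=lambda item: item[sort_key])
    return combined[-cap:] if cap > 0 else []


# Simulate 30 comments arriving one at a time, each with a vote count.
comments: List[Dict] = []
for i in range(30):
    new_comment = {"text": f"comment {i}", "votes": i}
    comments = push_capped(comments, [new_comment], cap=20, sort_key="votes")

print(len(comments))                 # the array never grows past the cap
print(comments[0], comments[-1])     # lowest and highest surviving entries
```

However many comments arrive, the array holds at most 20 of them, and the ones that survive are those with the most votes.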
A sizeable portion of MongoDB users are using the data store to capture location-based information, so this version will offer a number of enhancements in this realm, including a more accurate spherical model of Earth. It will also support GeoJSON, an open format for encoding geospatial data so it can be transferred among different systems.
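As a rough illustration of what this looks like in practice, the sketch below builds two GeoJSON points and measures the distance between them on a sphere. A GeoJSON point lists longitude first, then latitude. The distance calculation uses the well-known haversine formula with a mean Earth radius of 6,371 km; it is a stand-in chosen for illustration and not a description of the model MongoDB uses. The helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def geojson_point(longitude: float, latitude: float) -> dict:
    """Build a GeoJSON Point; coordinates are [longitude, latitude]."""
    return {"type": "Point", "coordinates": [longitude, latitude]}


def spherical_distance_km(a: dict, b: dict) -> float:
    """Great-circle distance between two GeoJSON points (haversine)."""
    lon1, lat1 = (math.radians(v) for v in a["coordinates"])
    lon2, lat2 = (math.radians(v) for v in b["coordinates"])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


new_york = geojson_point(-74.0060, 40.7128)
london = geojson_point(-0.1278, 51.5074)

distance = spherical_distance_km(new_york, london)
print(f"New York to London: about {distance:.0f} km")
```

Because the points are plain GeoJSON structures, the same data could be handed to any other system that understands the format.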
On the performance front, MongoDB's count operations have been refactored, so now index-based counts are 20 times faster than in previous editions. The data store's aggregation framework, which can be used as the basis for real-time analysis of data, is now three to five times as fast for many operations, the company claimed.
MongoDB Enterprise offers additional capabilities. The new version comes with an on-premises monitoring tool that can watch more than 100 operational metrics of a running MongoDB system in real time. Previously, 10gen offered such metrics as a service, though this will be the first version of the software to offer the metrics for internal systems.
MongoDB Enterprise also supports Kerberos authentication, allowing enterprise applications to securely authenticate to MongoDB with this standard.
MongoDB Enterprise starts at US$5,000 per year, which includes 24-hour support and a one-hour response time for critical items.