The Challenges of Managing Mountains of Information

If you think the storage systems in your data centers are out of control, imagine having 449 billion objects in your database, or having to add 40 terabytes of new data each week.

By John Brandon
Tue, October 18, 2011

Computerworld — If you think the storage systems in your data centers are out of control, imagine having 449 billion objects in your database, or having to add 40 terabytes of new data each week.

The challenges of managing massive amounts of big data involve storing huge files, creating long-term archives and, of course, making the data accessible.

While data management has always been a key function in corporate IT, "the current frenzy has taken market activity to a whole new level," says Richard Winter, an analyst with Wintercorp Consulting Services, a firm that studies big data trends.

New products appear regularly from established companies and startups alike. Whether it's Hadoop, MapReduce, NoSQL or one of several dozen data warehousing appliances, file systems and new architectures, the data analytics segment is booming, he says.

"We have products to move data, to replicate data and to analyze data on the fly," says Winter. "Scale-out architectures are appearing everywhere as vendors work to address the enormous volumes of data pouring in from social networks, sensors, medical devices and hundreds of other new or greatly expanded data sources."

Some shops know about the challenges inherent in managing really big data all too well. At Amazon.com, Nielsen, Mazda and the Library of Congress, this task has required adopting some innovative approaches to handling billions of objects and petascale storage media, tagging data for quick retrieval and rooting out errors.

Taking a metadata approach

The Library of Congress processes 2.5 petabytes of data each year, which amounts to around 40TB a week. Thomas Youkel, group chief of enterprise systems engineering at the library, estimates the data load will quadruple in the next few years as the library continues to carry out its dual mandates to serve up data for historians and preserve information in all its forms.

The library stores information on 15,000 to 18,000 spinning disks attached to 600 servers in two data centers. Over 90% of the data, or more than 3PB, is stored on a fiber-attached SAN, and the rest is stored on network-attached storage drives.

"The Library of Congress has an interesting model" in that part of the information stored is metadata -- or data about what is stored -- while the other is the actual content, says Greg Schulz, an analyst at consultancy StorageIO. Although plenty of organizations use metadata, Schulz explains that what makes the Library of Congress unique is the sheer size of its data store and the fact that it tags absolutely everything in its collection, including vintage audio recordings, videos, photos and files on other types of media.

Continue Reading

Originally published on www.computerworld.com. Click here to read the original story.
What is Tech Briefcase?
TechBriefcase is a new, free service where IT Professionals can Search, Store and Share IT white papers and content like this. Learn more
Bookmark content
Speed up your research efforts with content across the web.
Search and Store
Find the white papers you need. Create folders for any topic.
View Anywhere
Open your briefcase on your iPhone, tablet or desktop. Share with colleagues.
Don't have an account yet?
Read this new eBook to learn the top five scenarios and essential best practices for preventing database attacks and insider threats.
In this report, Enterprise Strategy Group reviews how HP's portfolio of hardware, software, and services can provide the foundational support for VMware environments. When it comes to business continuity, HP Converged Storage streamlines virtualization initiatives, accelerating realization of the business benefits that contribute to IT's ability to maintain high service levels and customer satisfaction.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
This report defines "tier-1" storage in the modern IT world and in the data centers and services that support it. What was a simple environment just a few years ago with mainframes or a few large servers to be supported has evolved into a complex web of virtual machines, clouds, and expanding user expectations -- factors which demand and create flexibility, but do so in a way that pushes a lack of predictability upon the storage infrastructure. Learn what your criteria should be for tier-1 storage.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
Server virtualization has transformed corporate IT -- companies have enjoyed major cost savings and have gained flexibility and efficiency. But this has also led to a proliferation of virtual machines and servers that threaten to overwhelm data movement and storage technologies. In this IDG Tech Dossier, learn how utility storage makes for massive consolidation, flexibility and scalability, so IT departments can reduce storage infrastructure and lower costs while improving their ability to respond to fast-changing needs of business units.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
Forrester Consulting provides an analysis of four HP 3PAR storage customer implementations to quantify the efficiency and cost savings achieved over legacy storage platforms. On average, HP 3PAR storage customers achieved a 10.4 month payback with a 55 % ROI over a 3-year evaluation period and a significant reduction in CapEx and OpEx over that same period as a result of thin provisioning, maintenance costs avoided and labor productivity gains.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
Learn how HP CloudSystem Matrix and HP 3PAR Utility Storage provide a solid, flexible foundation for your cloud environment.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
Date: Tuesday, July 17, 2012 2:00 PM EDT

Traditional NAS systems don't scale beyond fixed limits. Proliferation of NAS systems leads to management challenges. Many organizations also use traditional block-based SAN solutions for transactional systems like databases and email. Having separate block and file storage also adds to management challenges.
Need to do more with less? Watch this video to learn how HP ProLiant Gen8 servers can help your business deploy servers three times faster; initiate problem anaysis five times faster; increase administrator productivity three times; and experience storage performance six times faster.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
Business users increasingly demand 24x7 availability of their data while IT departments face the challenge of ensuring maximum availability while operating with limited budgets.
Date: May 31, 2012
Time: 1 PM EST

Organizations are reaping the benefits of simplifying IT, lowering costs and dramatically improving transactional throughput by deploying optimized application-to-disk solutions. These pre-tuned, tested solutions encompass a wide variety of applications and use cases. Hear from industry experts, and IT executives, how these full-stack solutions can achieve three times faster deployment times and up to 75% reductions in acquisition and operational costs.
Learn how to reduce IT management overhead, ease revision control, guarantee data security, scale systems more quickly and reduce server and software costs.
Find out when you join EMA Senior Analyst, Torsten Volk, for a discussion on the 2012 trends in workload automation and how these trends contribute to better connecting workload automation to business processes. These trends are derived from EMA's empirical research work conducted for the 2012 Workload Automation Radar Report.
Newsletter Sign-Up »

Receive the latest news test, reviews and trends on your favorite technology topics

Choose a newsletter
  1. View all Newsletters | Privacy Policy
Sponsored Links

Master the cloud with the power of convergence from HP

Connect with IT leaders redefining mobility at the Enterprise Mobile Hub

Choose New and manage one device instead of 170

Choose New for 8x the firewall and NAT performance

Check out a smart way of mobilizing your business with enterprise-ready Samsung Mobile.

Redefine your data center with HP servers.

Enhance your business with Windstream IT Solutions. Speak to someone local.

BlackBerry® Mobile Fusion. Different mobile devices. One platform.

Click to see how Accenture has delivered high performance to clients

CYBERMARYLAND | Learn Why Maryland is the Epicenter for Cybersecurity

Get Ethernet speeds from 1 Mbps to 10 Gbps - Comcast Business Class

Cognizant. Leading in Business, Application & Technology Services

Collaboration: driving better business outcomes

Gain cutting-edge insights at MIT in 2-5 day executive programs.

Complimentary Gartner Report on BYOD: Media Tablets & Beyond. View Now

Elevate storage agility and efficiency with HP 3PAR storage.

Choose New and slash the number of devices you manage

Customized information views & Twitter events at New Fulcrum Point

Splunk translates machine data into "aha" moments for IT and the business.

ManageEngine Desktop Central - Automate and Audit Your Desktop Management! Learn More...

Cloud Readiness Starts with Intel® Technology

High performance. Delivered. Click to see Accenture's client successes

Visit the Virtually There Learning Page to learn how to use virtualization to your competitive advantage.

Free: Hunter Muller's "The Transformational CIO."

Join us for an upcoming Microsoft 365 live online demo event.

Discover your easiest path to unified communications

Virtualizing Your Infrastructure Just Got Easier

Connect with global CIOs now at Enterprise CIO Forum

Resource Center