by Robert Lemos

Another Data Center Headache: Log Data Exploding

Feature
Apr 29, 2009
Data Center

The newest storage headache for data centers? A worsening torrent of real-time log data. Bad news: For compliance reasons, you'll soon have to not only store more log data, but also make it more searchable. Good news: You can use this data to improve security.

Following the March 2004 bombings in Madrid, Spain, law enforcement agencies searching for leads on those responsible for the attacks focused on the cell phones used by the terrorists and requested that European telecommunications providers turn over their call data. The only problem: It took the companies weeks to find the relevant data.

In an attempt to eliminate such problems in the future, the European Union created data-retention guidelines that require service providers to hold up to two years' worth of call records and Internet records. The amount of data that the companies have to store skyrocketed, becoming a major data center issue.

“One of the issues is the volume of data,” says Matthew Aslett, enterprise software analyst for The 451 Group. “One European telco we have spoken to cited three years of data equating to 36TB of storage.”

The storage problem reaches far beyond Europe. While most companies use data centers to store their primary business information—such as backups of important files and customer data—real-time log data and unstructured transactional data are quickly becoming major issues as well, according to Aslett and other experts.

Most industries will face a significant data problem in the future, as compliance requirements force them to not only retain more data, but also make such data easily searchable.

Banks have to keep data from cash machines, utilities have to keep data on various events happening on their control and monitoring networks, and public companies need to document who accessed certain sensitive financial data to be compliant with Sarbanes-Oxley.

Much of the data is stored as event logs from a host of different devices on a network.

In the past, event data was not stored in a way that made retrieval easy. Every device on a network, whether it sits on a bank’s ATM network, a corporate local network or a utility’s control network, generates event data, and storing that data has always been a problem. The issues will only become more significant in the future.

“Clearly some of the major drivers are SOX and PCI (requirements), for which security log management is a partial answer to the problem, but issues such as the EU data retention guidelines for electronic communications are potentially broader and larger problems in terms of the amount of data to be collected and analyzed,” Aslett says.

Hewlett-Packard, one of many companies that sell systems to handle so-called event data warehousing issues, sees customers dealing with anywhere from 10 GB of data per day to 1 TB of data daily.
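To put those daily rates in perspective, the short Python sketch below works out what they add up to over a retention period. It is a back-of-the-envelope illustration only: the two-year period is borrowed from the EU guideline mentioned earlier, the figures assume raw, uncompressed data in decimal units (1 TB = 1,000 GB), and the function names are invented for this example.

```python
DAYS_PER_YEAR = 365
GB_PER_TB = 1000  # assumption: decimal units, not binary (1024)


def retained_tb(gb_per_day: float, years: float) -> float:
    """Raw, uncompressed volume in TB accumulated over the retention period."""
    return gb_per_day * DAYS_PER_YEAR * years / GB_PER_TB


def implied_gb_per_day(total_tb: float, years: float) -> float:
    """Average daily intake in GB implied by a total volume and a period."""
    return total_tb * GB_PER_TB / (DAYS_PER_YEAR * years)


if __name__ == "__main__":
    # HP's reported range: 10 GB/day at the low end, 1 TB/day at the high end.
    for rate_gb in (10, 1000):
        total = retained_tb(rate_gb, years=2)
        print(f"{rate_gb} GB/day kept for 2 years: {total:.1f} TB")

    # The telco figure quoted by Aslett: 36 TB for three years of data.
    daily = implied_gb_per_day(36, years=3)
    print(f"36 TB over 3 years: about {daily:.1f} GB/day")
```

Under these assumptions, the low end of HP's range comes to 7.3 TB over two years and the high end to 730 TB, while the telco's 36 TB over three years implies an average of roughly 33 GB per day.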

“There is a torrent of information coming out of these devices,” says Gary Lefkowitz, a director in HP’s Secure Advantage group.

Yet, once collected, the data becomes an opportunity for the company, he says. “A lot of customers look at this as a compliance tax, but once you get your system running, it is not like you are just checking off the compliance box. There are a whole host of things you can do.”

Companies that store such event data in an easily accessible way, for example, find that they can analyze the data for anomalous events that could indicate an attacker in their system, says Jim Pflaging, CEO of data-warehousing software provider SenSage.

“We think there is a class of customers that will really see this as a positive thing for the security of their company,” he says. “To nail insiders, you really have to collect more data. Insiders don’t have failed logins—you have to be able to analyze how they accessed the data.”
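Pflaging's point about insiders can be illustrated with a toy example in Python. The sketch ignores failed logins entirely and instead counts successful reads per user, flagging anyone far above the typical level. The event format, the `read_ok` action name, the sample users and the three-times-median threshold are all assumptions made for illustration; this is not a description of how SenSage or any other product works.

```python
from collections import Counter
from statistics import median


def flag_heavy_readers(events, factor=3.0):
    """Return users whose successful-read count exceeds factor x the median.

    `events` is an iterable of (user, action) pairs. Only the hypothetical
    "read_ok" action is counted; failed logins are deliberately ignored.
    """
    counts = Counter(user for user, action in events if action == "read_ok")
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(user for user, n in counts.items() if n > factor * baseline)


# Invented sample data: carol reads far more records than her peers,
# while dave only ever produces failed logins.
SAMPLE_EVENTS = (
    [("alice", "read_ok")] * 4
    + [("bob", "read_ok")] * 5
    + [("carol", "read_ok")] * 40
    + [("dave", "login_failed")] * 6
)

if __name__ == "__main__":
    print(flag_heavy_readers(SAMPLE_EVENTS))
```

In this sample, only carol is flagged: she never fails a login, but her volume of successful access stands out. A real deployment would need far richer baselines, such as time of day, role and the sensitivity of the data touched.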

In the past, companies that collected log data in a single location would typically use a flat file, which made the data difficult to comb through for significant events, says Pflaging. By using more efficient database software to store and retrieve the data, companies also gain far more insight into what is happening among the devices on their network, he says.
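The difference between the two approaches can be sketched in a few lines of Python. The example uses SQLite, from Python's standard library, purely as a stand-in for "database software"; the article does not say what the vendors' products use internally. The event fields, sample records and function names are hypothetical.

```python
import sqlite3

# Invented sample events: (timestamp, device, username, action)
SAMPLE_EVENTS = [
    ("2009-04-01T09:00:00", "atm-17", "alice", "login_ok"),
    ("2009-04-01T09:05:00", "db-02", "bob", "read_financials"),
    ("2009-04-01T09:07:00", "db-02", "alice", "read_financials"),
    ("2009-04-02T23:40:00", "db-02", "bob", "read_financials"),
]


def scan_flat_file(lines, username):
    """Flat-file approach: every line is read and parsed on every search."""
    return [line for line in lines if line.split(",")[2] == username]


def build_store(events):
    """Load events into an in-memory SQLite table indexed by user and time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (ts TEXT, device TEXT, username TEXT, action TEXT)"
    )
    conn.execute("CREATE INDEX idx_user_ts ON events (username, ts)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", events)
    conn.commit()
    return conn


def events_for_user(conn, username):
    """Database approach: the index lets the engine avoid a full scan."""
    cur = conn.execute(
        "SELECT ts, device, action FROM events WHERE username = ? ORDER BY ts",
        (username,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    flat_lines = [",".join(event) for event in SAMPLE_EVENTS]
    print(scan_flat_file(flat_lines, "bob"))
    print(events_for_user(build_store(SAMPLE_EVENTS), "bob"))
```

With four records the two searches are equally fast. The gap opens up at the volumes discussed here, where a flat-file search must read every line while an indexed query can go straight to the matching rows.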

“For most companies, this security log data will be the largest single data store,” Pflaging says.
