by Nick Booth

Dodge the data deluge

Sep 27, 2012
6 mins
IT Strategy

Many people in IT are terrible hoarders, keeping things that should have been ditched years ago. The justification is usually that the hardware could come in handy some day, or that it would be fatal to remove it.

Many banks have rogue routers and ghost gateways on their networks which they are pretty sure add nothing, but nobody dares remove them in case they unwittingly sever a nerve that brings the whole network down.

Much the same could be said of data. These days, marketing departments are convinced that every scrap of data holds vital clues about customers. So you can’t chuck out any files, no matter how many millions of copies of them there are, just in case they’re worthy of analysis.

Added to this, most CIOs work in heavily regulated industries, where they dare not delete even the most innocuous Word document.

Which is why the cost of storing all your data continues to soar, despite all the advances in technology that halve the cost of storage every 18 months. Managing data is labour-intensive, but storing it is relatively cheap, so financial logic says you should keep throwing hardware at the problem of all that big data.

“Storage is getting cheaper if you want cheap and cheerful and keep away from the big brands,” says Richard Fox, head of IT at distribution company Gem.

Recent storage innovations like de-duplication have lessened the burden, he says.

By storing only unique data, and not the endless identical backups, each record can be shrunk to a tenth of its size, which means much less data has to be transferred between sites for backups.
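As a rough illustration of the idea, the minimal sketch below stores each unique piece of data once and keeps only a reference for every repeat. It assumes whole-record de-duplication keyed on a SHA-256 hash; commercial products typically work on smaller blocks, and the function names and sample data here are hypothetical, not taken from any vendor's product.

```python
import hashlib


def deduplicate(blobs):
    """Store each unique blob once, keyed by its SHA-256 digest.

    Returns the store of unique blobs and, for every input blob,
    the digest that points at its single stored copy.
    """
    store = {}        # digest -> bytes, one entry per unique blob
    references = []   # one digest per input blob, in original order
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        store.setdefault(digest, blob)  # keep only the first copy
        references.append(digest)
    return store, references


def restore(store, references):
    """Rebuild the original sequence of blobs from the references."""
    return [store[digest] for digest in references]


# Hypothetical sample: nine identical backups plus one changed version.
backups = [b"customer ledger v1"] * 9 + [b"customer ledger v2"]
store, refs = deduplicate(backups)

raw_bytes = sum(len(b) for b in backups)
kept_bytes = sum(len(b) for b in store.values())
print(f"{len(backups)} backups, {len(store)} unique, "
      f"{kept_bytes} of {raw_bytes} bytes kept")
```

The saving depends entirely on how repetitive the data is, so the sample above is not meant to reproduce the tenth-of-its-size figure quoted in the article.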

However, as data needs grow exponentially, the falling cost of each gigabyte merely encourages more fat files to be laid down.

The trap that many CIOs fall into is over-ambition. The popular theory is that all the tons of image files and social media chit-chat are going to yield incredible insights into the customer base, just as soon as the analysis tools can be honed to make sophisticated searches.

Meanwhile, many complex management layers, bought at great expense for storage systems, go unused. This is a tremendous waste, says Fox.

“Some of these features are excellent value and have obvious business benefit when they’re incorporated into backup and data recovery strategies,” he says.

The problem is not everyone gets time to learn how to use these features.

“If you don’t make use of these features you’re probably paying more than you need for storage,” Fox adds.

Shouldn’t CIOs be encouraging end users to create less, store less and delete more? The problem is that few end users realise that they never even see most of the data they create.

“Users are typically unaware of the automatic versioning, database and file replication, desktop and laptop shadow copying that take place until they’ve lost something,” says Fox.

“All those features are needed, and save a lot of hours. If you have the space, use it.”

But if you want to change the way users amass files, you could have a revolt on your hands, warns Fox.

“People just expect unlimited data storage: they have it at home with their email, so why not at work? People love their files, their movies and photos and monster Excel files. Humans are natural hoarders.”

So what can firms do to dissuade users from storing so much data – much of it personal?

“Many firms have 10 copies of backups around,” points out Mac Scott, partner at KPMG’s Advisory Practice.

“Archive and backup strategies aren’t sexy so have been at the background of thinking and investment,” says Scott. But, “with costs coming home to roost now,” perhaps that could change.

Another option is to bill departments according to their use of facilities, a strategy which could be driven by the cloud providers.

“Cloud providers and data storage providers might want to consider charging more for data as it ages. This would encourage businesses and consumers to have a pro-active archiving policy,” says Garry Lengthorn, director of IT services at recruitment company SThree.

This is not likely to happen, concedes Fox. It is doubtful whether each department will fully appreciate its legal requirement to keep certain types of data while deleting other types of record.

It’s taken years for the IT profession to learn this discipline, for a start. A more practical measure might be to make people more wary about saving useless data in the first place.

One way to do this would be to instil in users the discipline of cataloguing everything they save. When people have to fill in a form, with several fields of metadata, for everything they commit to the public domain, it soon makes them think twice about publishing it, and it will ultimately improve the search process.

“We have to make people more efficient at sharing their data through indexing and search features, so people can access all that extra data more quickly than ever,” says Fox.

It’s time to be bold as there are massive reductions to be made, according to Steve Shelton, head of data at BAE Systems Detica.

“Huge volumes of data could be deleted,” says Shelton.

Detica helped one UK telco shed 500TB through a combination of data compression techniques, de-duplicating data, reducing unused storage allocation and decommissioning redundant systems.

One of the important areas for economies is the mass of superfluous business continuity data.

“Rigorous data management is a massive efficiency and great for the environmental footprint of the datacentres,” says Shelton.

Tackling the over-provisioning of storage area networks and over-tiering of storage arrays is a lot easier than training users, argues Galvin Chang, associate director at storage specialist Infortrend.

Using a tool like Virtual Networks (which specialises in sorting out SANs), you can cut costs by 40 per cent, whereas it takes a superhuman effort to get 40 per cent productivity spikes from the workforce.

Native Format Optimisation comes highly recommended by Christoph Schmid, chief operating officer at Balesio, who says it can cut the data footprint of files by up to 90 per cent.

CommVault’s business development director Simon Gregory argues that unifying data management on the company’s Simpana platform removes the silos of data without neutralising your capacity for growth trending, prediction and cost metrics.

On the other hand, Radek Dymacz, head of R&D at Databarracks, says you could make a start by treating different file types differently.

“Marketing will now have large numbers of high-res images, as well as videos of product demos and trade fairs,” he says.

These static files are perfect for object storage systems like Amazon’s S3 or OpenStack’s Swift. These are fundamentally different storage systems, but they could yield some real cost savings, he predicts.
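Treating file types differently can start with something as simple as the sketch below, which sorts files into a storage class by extension. The extension list, the tier names and the `storage_target` function are all assumptions made for illustration; the code only decides where a file belongs and does not call S3, Swift or any other storage service.

```python
from pathlib import PurePath

# Hypothetical list of static media types suited to object storage.
STATIC_MEDIA = {".jpg", ".jpeg", ".png", ".tif", ".mp4", ".mov"}


def storage_target(filename):
    """Return a (hypothetical) storage tier name for a file.

    Static images and videos go to object storage; everything else
    stays on primary storage.
    """
    suffix = PurePath(filename).suffix.lower()  # case-insensitive match
    if suffix in STATIC_MEDIA:
        return "object-storage"
    return "primary-storage"


for name in ["trade_fair.MP4", "product_demo.png", "forecast.xlsx"]:
    print(name, "->", storage_target(name))
```

In practice the decision would probably also weigh how often a file changes and how quickly it needs to be retrieved, not just its extension.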

But don’t be too optimistic about any of these measures, warns independent consultant Dr Graham Oakes.

“Never underestimate the ability of a determined person to defeat these technologies, whether intentionally or accidentally,” he says.