How to Secure Big Data in Hadoop
The promise of big data is enormous, but it can also become an albatross around your neck if you don't make security of both your data and your infrastructure a key part of your big data project from the beginning. Here are some steps you can take to avoid big data pitfalls.
Thu, November 08, 2012
CIO — Big data promises to help organizations better understand their businesses, their customers and their environments to a degree that you could previously have only imagined.
The potential is enormous—as businesses transform into data-driven machines, the data held by your enterprise is likely to become the key to your competitive advantage. As a result, security for both your data and your infrastructure becomes more important than ever before.
Big Data Could Be Toxic Data If Lost
In the case of data that provides a competitive advantage, the need for security should be obvious. If you lose that data, or it winds up in the hands of a competitor, your advantage is lost. But worse, it could become a liability.
In many cases, organizations will wind up with what Forrester Research calls "toxic data." For instance, imagine a wireless company that is collecting machine data (who's logged onto which towers, how long they're online, how much data they're using, whether they're moving or staying still) that can be used to provide insight into user behavior.
That same wireless company may have lots of user-generated data as well: credit card numbers, social security numbers, data on buying habits and patterns of usage—any information that a human has volunteered about their experience. The capability to correlate that data and draw inferences from it could be valuable, but it is also toxic because if that correlated data were to go outside the organization and wind up in someone else's hands, it could be devastating both to the individual and the organization.
With Big Data, Don't Forget Compliance and Controls
"Most of the big data projects we've been exposed to seem kind of frenetic," says Larry Warnock, CEO of Austin, Texas-based Gazzang, which specializes in data security solutions and operational diagnostics. "There seems to be a mad dash to access this data, and some of the old-school compliance and controls have sort of been left for phase two of the project. If you go so fast that you lose sight of basic best practices, companies may get themselves into a bit of a bind."
"Hadoop and similar NoSQL data stores enable any organization, large or small, to collect, manage and analyze immense data sets, but these nascent technologies were not necessarily designed with comprehensive security in mind," adds Dustin Kirkland, CTO of Gazzang. "As these repositories grow in popularity and size, the potential for sensitive data to get swept up and stored is significant."
9 Tips for Securing Big Data
Here are some specific steps you can take to secure your big data:
- Think about security before you start your big data project. You don't lock your doors after you've already been robbed, and you shouldn't wait for a data breach incident before you secure your data. Your IT security team and others involved in your big data project should have a serious data security discussion before installing and feeding data into your Hadoop cluster.