Many organizations are deploying Hadoop to help launch their big data projects. Unfortunately, Hadoop runs in non-secure mode by default, which means sensitive data is at risk from both internal and external threats. Given the value of the data in Hadoop, it's critical that organizations secure their big data deployments before moving them into production.

As part of that effort, organizations should strictly control user access to nodes in a Hadoop cluster. At the same time, however, they also need a way to centrally manage access rights to avoid additional operational overhead and the risk of manual error. Leveraging Active Directory for big data identity and access management can solve both issues.

The Pitfalls of Using Kerberos to Secure Hadoop

Developers are aware of the implications of deploying Hadoop in non-secure mode, and many have implemented a secure mode, including incorporating Kerberos to authenticate users and services from one node to the next. While this is a step in the right direction, it still presents a variety of challenges for enterprise IT organizations.

Even with the incorporation of Kerberos, Hadoop continues to run in non-secure mode by default. To use Kerberos, organizations must go through the time-consuming, error-prone, multi-step process of setting up an MIT Kerberos environment. Once it is set up, organizations need a way to centrally manage user access. Without one, user accounts must be set up on each of the hundreds or thousands of nodes across the organization's multiple Hadoop clusters.

And it doesn't end there. Like regulatory compliance requirements, the Hadoop ecosystem is highly dynamic.
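To illustrate that setup burden, a standalone MIT Kerberos realm for a cluster might be provisioned along these lines (the realm, host names, and keytab paths are hypothetical; this is a sketch, not a complete procedure):

```shell
# Initialize the KDC database for a dedicated Hadoop realm
kdb5_util create -r HADOOP.EXAMPLE.COM -s

# Create per-service, per-node principals and export their keytabs
kadmin.local -q "addprinc -randkey hdfs/node01.example.com@HADOOP.EXAMPLE.COM"
kadmin.local -q "addprinc -randkey yarn/node01.example.com@HADOOP.EXAMPLE.COM"
kadmin.local -q "ktadd -k /etc/security/keytabs/hdfs.keytab hdfs/node01.example.com"

# ...repeated for every service principal on every node, in every cluster
```

Each of those steps must be kept in sync with the cluster as it grows, which is exactly the manual, per-node effort described above.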
Each time the environment or a regulatory requirement changes, so must user access rights, making the job of managing access increasingly complex.

Finally, setting up a Kerberos environment also means creating a parallel identity infrastructure that is redundant with most organizations' Active Directory environments. Any change to a user's role and responsibilities must then be applied in two identity management environments.

Leveraging Existing Identity Infrastructure for Big Data Security Authentication

A better approach to securing Hadoop production deployments is to use a solution that takes advantage of the existing Active Directory infrastructure, which already provides Kerberos authentication capabilities. With this centralized, cross-platform identity management infrastructure, IT organizations can grant access to Hadoop clusters using existing identities and group memberships rather than creating new identities for users across every Hadoop cluster.

This approach also lets organizations leverage existing skill sets and management processes to set up user accounts and access to big data nodes, and it reduces costs and the risk of error, in turn improving security. Using existing Active Directory accounts to log in also secures Hadoop environments while helping to prove compliance in a repeatable, scalable, and sustainable manner.

How it Works

Active Directory deployments are often complex, but a unified identity management solution can simplify and streamline connecting and managing non-Windows servers in complex Active Directory environments. Through Hadoop integration, an identity management solution can connect Hadoop clusters to the existing Active Directory infrastructure.
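As a sketch of what that integration step can look like, joining a Linux cluster node to Active Directory with Centrify's command-line tools might resemble the following (the domain, zone, and admin account names are hypothetical):

```shell
# Join this node to the AD domain, placing it in a Centrify zone
adjoin -u admin -z hadoop example.com

# Verify the node's join status and domain information
adinfo
```

On platforms managed without Centrify, a comparable open-source path is realmd's `realm join --user=admin example.com`. Either way, once the node is domain-joined, its Kerberos authentication is anchored in Active Directory rather than in a separate MIT KDC.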
Once cluster nodes are integrated, automated authentication from one node to the next requires only the addition of new service accounts.

A unified identity management solution can also automate Hadoop service account management. The power of Active Directory's Kerberos and LDAP capabilities is extended to Hadoop clusters, delivering authentication for both Hadoop administrators and end users. If privileges are already defined and associated with Active Directory users, they can be reused in the Hadoop environment. When users log in to Hadoop through Active Directory, they receive the same privileges and restrictions they are assigned outside the Hadoop ecosystem. This single sign-on capability helps increase user productivity as well as overall security.

In addition to access management across Windows, Linux, and Unix servers, a unified identity management solution provides privilege management and auditing capabilities that can be extended across the entire organization, including the Hadoop environment. The solution can control access, manage privileges, audit activity, and tie everything back to an individual Active Directory account. It also generates reports showing who has access and who did what across Hadoop clusters, nodes, and services, helping address compliance and audit requirements.

The Bottom Line

Security is crucial for big data deployments, but it must be applied in a way that is both efficient and reliable. Implementing an identity management solution that integrates with an organization's existing Active Directory infrastructure meets both requirements. Using this approach, organizations avoid setting up a siloed identity infrastructure just for Hadoop, and instead leverage a trusted solution that delivers group-based access controls for Hadoop cluster access management.
This provides tightly enforced access controls and centrally managed least-privilege security policies for the Hadoop environment.

To learn more about how Centrify's unified identity management solution can help you secure your big data deployments, download the white paper How Identity Management Solves Five Hadoop Security Risks.