Saturday, July 4, 2015

Business Problems: Overcoming the Confidentiality and Integration challenges with Hadoop Clusters using Centrify


Hadoop implementations present multiple challenges to enterprises at the operating system layer (*). I like to categorize them in two areas:


  • Hadoop clusters are insecure by default. There is no service-to-service authentication, and privileged users have access to world-readable information and can elevate to privileged Hadoop accounts.
  • Multiple clusters are needed because of the development nature of the apps. Typically at least DEV/QA and PROD environments are needed. Depending on the risk profile of the organization, each environment may be isolated from the others and require different access control rules.
  • Different types of users need access: from the SysAdmin, to the Hadoop Admin, to the Data Scientist, they all have different access and privilege needs.
  • The data classification of the business intelligence may require additional controls. What if the cluster crunches Personal, Financial, Health, Energy or Card data? Then SOX, PCI, HIPAA, NERC or FERC compliance is needed.


  • Kerberos: Many organizations balk at the proposition of standing up a separate MIT Kerberos implementation, and even if AD is an option, test environments may sit behind a one-way trust.
  • Different organizations == Different requirements, therefore the devil is in the details:
    • Process
    • Technology/Infrastructure
    • People
    • Regulations
(*) There are additional security challenges, like how to protect data at the Hadoop layer. For this, your trusted Hadoop vendor (Cloudera, Hortonworks, MapR, etc.) has an ecosystem of applications, and Centrify can provide identity information to those apps.

Technical Briefing

The following videos provide technical demos of how Centrify can overcome these challenges.

Putting it all together

  • Centrify allows for OS-level integration for Linux and UNIX systems that enables:
    • Centralized administration of multiple Hadoop clusters
    • Regardless of how complex your AD may be 
    • No schema extensions or software in domain controllers
    • Using UNIX frameworks
    • Kerberos just works out of the box
    • Leverage AD fully:  Kerberos, Group Policy, PKI
  • Centrify enables the implementation of strong access controls to enforce
    • Least access
    • Least privilege (RBAC, not password-centric)
    • Easy attestation and reporting
    • Separation of Duties
    • Works on Windows to eliminate the problem of the persistent administrator
  • For environments with Personal, Financial, Health or Card data
    • Session transcription 
    • Session replay
    • Event consolidation
    • Works on Windows
  • Hadoop-exclusive features:
    • adkeytab for advanced keytab/service account provisioning
    • Kerberos infinite ticket renewal parameters and GPOs
    • LDAP Proxy to assist apps like Sentry, Hue and Knox
    • Partnerships with Cloudera, MapR and Hortonworks
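
To make the "least access" and "least privilege" points above a bit more concrete, here is a minimal sketch of role-based access that differs per environment. It is only an illustration of the idea: the role names, rights, users and the is_allowed helper are all hypothetical and are not Centrify's role model, API or configuration format.

```python
from dataclasses import dataclass

# Hypothetical roles and rights, for illustration only.
# These are NOT Centrify role definitions or commands.
ROLE_RIGHTS = {
    "sysadmin": {"login", "manage_os"},
    "hadoop_admin": {"login", "run_as_hdfs"},
    "data_scientist": {"login"},
}


@dataclass(frozen=True)
class Assignment:
    """A role granted to one user in one environment (e.g. DEV/QA or PROD)."""
    user: str
    role: str
    environment: str


def is_allowed(assignments, user, environment, right):
    """Return True only if the user holds a role in that environment
    whose rights include the requested right."""
    return any(
        a.user == user
        and a.environment == environment
        and right in ROLE_RIGHTS.get(a.role, set())
        for a in assignments
    )


# Hypothetical users: alice administers Hadoop in DEV/QA only,
# bob is a data scientist in PROD.
assignments = [
    Assignment("alice", "hadoop_admin", "DEV/QA"),
    Assignment("bob", "data_scientist", "PROD"),
]

print(is_allowed(assignments, "alice", "DEV/QA", "run_as_hdfs"))
print(is_allowed(assignments, "alice", "PROD", "login"))
```

The point of the sketch is that nobody shares a privileged password: rights come from a role, and a role granted in DEV/QA says nothing about PROD.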

Centrify + AD + Hadoop = faster, more secure and regulation-aligned big data projects.
