Saturday, October 3, 2015

Business Problems: What does 'Secure' mean to you? Your answer may impact your ability to launch your Big Data project

We had the opportunity to staff the Centrify booth at the Strata+Hadoop show in New York this past week, and we were constantly surprised by how much of the focus was on core capability alone. When we engaged passers-by, they were unaware of the following facts:

  • A large majority of Big Data projects never make it into production
  • The most visible examples of Big Data projects that add value to organizations involve data that may contain Personally Identifiable Information (PII) or financial information.

It does not take a security expert to realize that these projects would be subject to the strictest security requirements.

When challenged, people gave answers like "but [management tool] can be integrated with Active Directory" (on the more informed side of the spectrum), or they simply focused on areas like table-level security.

It became clear to me that "to secure" a capability means different things depending on who you ask (e.g. Big Data Scientist vs. Linux administrator vs. Security Analyst). This is why at Centrify we always encourage the main stakeholder to bring in their infrastructure and security peers.

The key takeaway is that, regardless of who you ask, there are many reasons a Big Data project can stall before production, like:
  • people: being able to find capable data scientists
  • data: availability of the data
  • cost
  • dynamics: integration with business processes
  • need: is it just a fad (businesses want to jump on the Big Data bandwagon without real requirements)
But security will always come up; it is the common denominator across all Big Data projects.  We know this firsthand because that's what we do.  Upwards of 60 of our customers are using Centrify to align their Big Data deployments with security requirements.

If you're tasked with launching a Big Data project, you have every opportunity to get a leg up on this problem by attacking security early.

At a basic level you have to look at this in terms of layers.


To simplify, there are two of them:  the Identity/Infrastructure layer and the Big Data layer.  Their concerns and focus are completely different.

OS Layer
I also like to call it the Identity/Infrastructure layer because it plays directly into what Identity and Access Management sets out to do:

a) Use a common directory to identify and authenticate users
b) Enforce the least access principle
c) Enforce the least privilege principle
d) Eliminate the human problem of shared accounts
e) Implement strong controls (e.g. end-to-end session auditing or multi-factor authentication) when needed
f) Be able to attest who has access to a system
g) Provide reporting and tools for attestation
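
As a rough illustration of items (f) and (g), here is a minimal Python sketch of an attestation report that answers "who has access to this system?". The group names, host names, and the two dictionaries are hypothetical stand-ins for what a common directory would provide; this is not how Centrify or any particular product implements it.

```python
# Hypothetical directory data: group -> members.
# In practice this would come from a common directory, not a dict.
GROUP_MEMBERS = {
    "hadoop-admins": {"alice"},
    "data-scientists": {"bob", "carol"},
}

# Hypothetical access rules: host -> groups allowed to log in.
HOST_ACCESS = {
    "namenode01": {"hadoop-admins"},
    "edgenode01": {"hadoop-admins", "data-scientists"},
}


def who_has_access(host):
    """Return a sorted list of users who can log in to the given host."""
    users = set()
    for group in HOST_ACCESS.get(host, set()):
        users |= GROUP_MEMBERS.get(group, set())
    return sorted(users)


def attestation_report():
    """Return a host -> users mapping suitable for periodic review."""
    return {host: who_has_access(host) for host in sorted(HOST_ACCESS)}


if __name__ == "__main__":
    for host, users in attestation_report().items():
        print(f"{host}: {', '.join(users)}")
```

Because access is granted to groups rather than individuals, the same data that enforces least access also produces the report an auditor asks for.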

Big Data Layer
I can't even pretend to advise on the Big Data layer, but what I do know is this:
If you can't identify, you can't authenticate; if you can't authenticate, you can't authorize; and if you can't authorize, you can't enforce strong access controls.  Regardless of Knox, Sentry and any other security initiatives at the Big Data layer, you need robust OS services to get the most out of them.
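
The dependency chain above (identify, then authenticate, then authorize) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the in-memory `DIRECTORY` and `POLICY` tables, the user and resource names, and the plaintext password are toy stand-ins, not the API of Centrify, Knox, Sentry or Active Directory.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    groups: frozenset


# Hypothetical directory: username -> (password, groups).
# A real deployment would query a common directory and never store plaintext.
DIRECTORY = {
    "alice": ("s3cret", frozenset({"data-scientists"})),
}

# Hypothetical policy: resource -> groups allowed to use it.
POLICY = {
    "sales_table": frozenset({"data-scientists"}),
}


def identify(username):
    """Step 1: map a name to a known identity, or None if unknown."""
    entry = DIRECTORY.get(username)
    return User(username, entry[1]) if entry else None


def authenticate(user, password):
    """Step 2: only an identified user can prove who they are."""
    if user is None:
        return False
    expected = DIRECTORY[user.name][0]
    return hmac.compare_digest(expected.encode(), password.encode())


def authorize(user, authenticated, resource):
    """Step 3: only an authenticated user can be granted access."""
    if user is None or not authenticated:
        return False
    return bool(user.groups & POLICY.get(resource, frozenset()))


def can_access(username, password, resource):
    user = identify(username)
    return authorize(user, authenticate(user, password), resource)
```

Each step refuses to proceed if the one before it failed, which is why table-level rules at the Big Data layer are only as strong as the identity services underneath them.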

A promising future
We were also excited by the bright spots:  Cloudera, Hortonworks and MapR are taking security very seriously because they realize that it affects their ability to bring nodes into production.

Centrify is here to help!   Learn what we mean by "secure": it's all about the OS Layer (Identity and Access Control).

Overcoming the Hadoop Security Challenges at the IAM layer with Centrify
