Tuesday, January 06, 2015

Securing Big Data - Part 1

As Big Data and its technologies, such as Hadoop, head deeper into the enterprise, questions around compliance and security are rearing their heads.

The first interesting point is what this shows about the approach to security that many of the Silicon Valley companies using Hadoop at scale have taken: pretty little, really.  Protecting information clearly hasn't been seen as massively important, because the basic pieces needed to do it just aren't there within Hadoop.  It's not something that was designed to be secure from day one; it was designed to be easy to use and to do a job.  It's only as governments and enterprises with compliance departments begin to look at Hadoop that these requirements are really surfacing.

But I'm not going to talk about the encryption and tokenization solutions.  Those are hygiene factors, not really about securing the information, because it's still going to be accessible to people with the right permissions, and so it's still at risk from Cyber or Social Engineering attacks.  This means that security for Big Data is about layering, and really it's about two different ways of viewing security.

IT solutions, the technical solutions, can really help with access control and encryption, but they don't actually prevent information being stolen.  What stops information leaking is the business decisions you make.

The first of those is: what information do I actually store?  You might decide to store only IDs for customer information in Hadoop, and use a more traditional store to provide the cross-reference from IDs back to customer information when it's required.
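To make the "store only IDs" idea concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the CustomerVault class stands in for whatever traditional, tightly controlled store you already have, the to_analytics_record function and the record fields are hypothetical, and a real system would persist the mapping rather than hold it in memory.

```python
import uuid


class CustomerVault:
    """Hypothetical stand-in for the traditional store that holds the
    cross-reference between opaque IDs and real customer details."""

    def __init__(self):
        self._details_by_id = {}
        self._id_by_email = {}

    def id_for(self, customer):
        # Reuse the same opaque ID for a customer we have already seen,
        # so analytics can still group records per customer.
        key = customer["email"]
        if key not in self._id_by_email:
            customer_id = uuid.uuid4().hex
            self._id_by_email[key] = customer_id
            self._details_by_id[customer_id] = dict(customer)
        return self._id_by_email[key]

    def lookup(self, customer_id):
        # Only callers with access to the vault can get back to a person.
        return self._details_by_id[customer_id]


def to_analytics_record(event, vault):
    """Strip the customer details and keep only an opaque ID, producing
    the record that would be written to the Big Data store."""
    record = {k: v for k, v in event.items() if k != "customer"}
    record["customer_id"] = vault.id_for(event["customer"])
    return record
```

The point of the split is that someone who walks off with the analytics records has IDs and behaviour, but not names or contact details; those stay behind a separate door.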

Above this, however, is actually using Hadoop to protect Hadoop.  What is Hadoop?  It's a Big Data analytics platform.  What is the biggest threat around data?  Social Engineering or Cyber 'inside the walls' attacks.  So the first stage in securing Hadoop is to do the IT basics, but the next stage, securing the business of information, is about using the power of Hadoop to secure the information stored within it.
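One way to picture "using Hadoop to protect Hadoop" is to run analytics over the platform's own access records.  The sketch below is only an illustration under assumptions: the (user, day, records_read) row shape, the flag_unusual_readers function and the threshold are all hypothetical, and at real scale this logic would more likely run as a job on the cluster than as plain Python.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_unusual_readers(audit_rows, today, threshold=3.0):
    """Flag users whose reads today are far above their own history.

    audit_rows is an iterable of (user, day, records_read) tuples,
    an assumed shape for illustration only.
    """
    # Total records read per user per day.
    totals = defaultdict(int)
    for user, day, records_read in audit_rows:
        totals[(user, day)] += records_read

    # Each user's daily totals on earlier days form their baseline.
    history = defaultdict(list)
    for (user, day), total in totals.items():
        if day != today:
            history[user].append(total)

    flagged = []
    for (user, day), total in totals.items():
        if day != today:
            continue
        past = history.get(user, [])
        if len(past) < 2:
            continue  # not enough history to judge
        baseline = mean(past)
        # Floor the spread at 1.0 so a perfectly flat history
        # does not flag every tiny increase.
        spread = max(pstdev(past), 1.0)
        if total > baseline + threshold * spread:
            flagged.append(user)
    return sorted(flagged)
```

An account with the right permissions that suddenly reads fifty times its usual volume is exactly the 'inside the walls' case that access control alone will not catch.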
