Wednesday, January 07, 2015

Securing Big Data - Part 2 - understanding the data required to secure it

In the first part of Securing Big Data I talked about the two different types of security.  The first is the traditional IT and ACL security that needs to be done to match traditional RDBMS solutions.  That is pretty much where those systems stop, which means they don't address the real threats out there: cyber attacks and social engineering.  An ACL is only any good if people do what they are expected to do.  Both the Target credit card breach and Edward Snowden's removal of documents from the NSA involved access that the ACLs treated as legitimate (stolen vendor credentials in one case, an insider in the other), so ACL approaches were part of the problem, not the solution.

With Big Data and Hadoop we have another option: to actually use Hadoop to secure the information in Hadoop.  This means getting more information into Hadoop, particularly network and machine log information.

So why do you need all this information?  Obviously you need to know the access requests and what data is accessed, but you also need to know where the network packets go.  Are they going to a normal desktop or corporate laptop, or do they head out of the company and onto the internet?  Does the machine that is downloading information have a USB drive plugged in, with the files being copied to it?  Is there a person logged into the machine, or is it just a headless workstation?
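
As a rough illustration of those questions, here is a minimal Python sketch that attaches that context to a single access request. Everything in it is an assumption made for the example: the `AccessEvent` fields, the flag names and the internal network ranges are hypothetical, and in practice they would be populated from your own network and machine logs.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical internal address ranges; replace with your own network layout.
INTERNAL_NETWORKS = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]


@dataclass
class AccessEvent:
    """One data access request joined with network and machine context."""
    user: str
    dataset: str
    destination_ip: str         # where the network packets end up
    usb_storage_attached: bool  # is a USB drive plugged into the machine?
    interactive_session: bool   # is a person logged in, or is it headless?


def leaves_company(event: AccessEvent) -> bool:
    """True if the destination is outside every known internal range."""
    dest = ip_address(event.destination_ip)
    return not any(dest in net for net in INTERNAL_NETWORKS)


def context_flags(event: AccessEvent) -> list:
    """Return the contextual warning signs present on this request."""
    flags = []
    if leaves_company(event):
        flags.append("external_destination")
    if event.usb_storage_attached:
        flags.append("usb_storage_attached")
    if not event.interactive_session:
        flags.append("no_interactive_user")
    return flags
```

None of these flags proves anything on its own. The point is that the ACL decision ("is this user allowed to read this dataset?") never sees any of them.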

The point here is that when looking at securing Big Data you need to take a Big Data view of security.  This means going well beyond traditional RDBMS approaches and not building a separate security silo.  Instead you look at the information itself and add in information about how it is accessed and where it is accessed from.

For a single request this builds up a complete view of that request.  By storing that information you can also start building up a profile over time, to understand what other information has been requested and how those requests match or can be linked.  Thus if someone makes 100 individual requests that on their own are OK, but taken in aggregate represent a threat, then it's the storage of that history of requests that gives you the security.
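
To make the "100 individual requests" point concrete, here is a minimal Python sketch of checking a request against the stored history rather than in isolation. The `RequestHistory` class, its threshold and its time window are hypothetical choices for the example; a real deployment would keep the history in Hadoop instead of in memory and would use far richer profiles than a simple record count.

```python
from collections import defaultdict


class RequestHistory:
    """Keeps every request per user and checks the aggregate, not the single call."""

    def __init__(self, max_records, window_seconds):
        self.max_records = max_records        # assumed per-user limit in the window
        self.window_seconds = window_seconds  # how far back the aggregate looks
        self._events = defaultdict(list)      # user -> [(timestamp, record_count)]

    def record(self, user, timestamp, record_count):
        """Store the request; return True if the windowed total is over the limit."""
        events = self._events[user]
        events.append((timestamp, record_count))
        cutoff = timestamp - self.window_seconds
        total = sum(count for ts, count in events if ts > cutoff)
        return total > self.max_records


# 100 small requests, each harmless on its own, ten seconds apart.
history = RequestHistory(max_records=500, window_seconds=3600)
results = [history.record("alice", i * 10, 10) for i in range(100)]
```

Each request here asks for only 10 records, which an ACL would allow every time. It is only because the earlier requests were stored that the later ones can be recognised as part of a larger extraction.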

So to secure Big Data it is not enough to look at individual requests, which is the ACL model.  We need to start building up the broad picture of how the information is accessed, what context it is accessed within and where that information ends up.  Securing Big Data, or in reality any data, is about looking at the business context rather than keeping a purely technical focus on encryption and ACLs, which are simply hygiene factors.
