Thursday, September 17, 2020

Service Accounts suck - why data futures require end-to-end authentication.

Can we all agree that "service" accounts suck from a security perspective? These are the accounts you set up so that one system or service can talk to another. Often this is a database connection, where the application uses one account (and thus one connection pool) to access the database. These service accounts are sometimes unique to a service or application, but often it's one standard service account for anything that needs to connect to a system.

The problem with that is that security is then defined at the service-account level, not based on the users actually making the requests. So if that database contains the personal information of every customer, you are relying on the application to ensure it only displays the information for a given customer; the security isn't with the data, it's with the application.

Back in 2003 a group called the "Jericho Forum" was set up under The Open Group to look at the infrastructural challenges of de-perimeterisation, and they created a set of commandments, the first of which is:
The scope and level of protection should be specific and appropriate to the asset at risk. 

Service accounts break this commandment: they take the most valuable asset (the data), strip the security scope away from it and place it in the application. What needs to happen is that the original requestor of the information is authenticated at every level, like with OAuth, so that if I'm only allowed to see my data then, even if someone makes an error in the application code or I run a Bobby Drop Tables attack, my "SELECT *" only returns my records.
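To make that concrete, here is a minimal sketch of what pushing the requestor's identity down to the data layer could look like. It assumes a PostgreSQL database with a row-level security policy on a hypothetical customer_data table and a JDBC driver on the classpath; the application still uses its pooled service connection, but it tells the database who the end user is, and the database, not the application, filters the rows:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PerUserQuery {

    // Hypothetical connection details -- in reality these come from configuration.
    private static final String URL = "jdbc:postgresql://localhost:5432/customers";

    public static void main(String[] args) throws Exception {
        // The subject extracted from the caller's OAuth token (made-up value).
        String authenticatedUserId = "customer-42";

        try (Connection conn = DriverManager.getConnection(URL, "app_user", "app_password")) {
            conn.setAutoCommit(false);

            // Tell the database who the end user is for this transaction. A
            // row-level security policy on customer_data (defined separately,
            // e.g. USING (customer_id = current_setting('app.current_user')))
            // then filters every query down to that user's rows.
            try (PreparedStatement ctx = conn.prepareStatement(
                    "SELECT set_config('app.current_user', ?, true)")) {
                ctx.setString(1, authenticatedUserId);
                ctx.execute();
            }

            // Even an unfiltered SELECT * now only returns the caller's records.
            try (PreparedStatement stmt = conn.prepareStatement("SELECT * FROM customer_data");
                 ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("customer_id"));
                }
            }
            conn.commit();
        }
    }
}

The specific mechanism matters less than where the filter lives: with the data, so an application bug or a rogue query can only ever leak the caller's own rows.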

This changes a lot of things, connection pooling for starters, but for reporting in particular we have to get away from technologies that force system accounts and therefore require multiple security models to be implemented within the consumption layer.

The appropriate level at which to protect data is the data level, and the scope is the data itself. Only by shifting our perception from service accounts and databases to data being the asset can we start building security models that actually secure data as an asset.

Today most data technologies assume service accounts, which means most data technologies don't treat data as an asset. This has to change.

Thursday, August 27, 2020

Getting RocksDB working on Raspberry Pi (UnsatisfiedLinkError when trying to run Kafka Streams)

If you are here it's probably because you've tried to get RocksDB working on a Raspberry Pi and hit the following exception:

Exception in thread "main-broker-b066f428-2e48-4d73-91cd-aab782bd9c4c-StreamThread-1" java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni7453541812184957798.so: /tmp/librocksdbjni7453541812184957798.so: cannot open shared object file: No such file or directory (Possible cause: can't load IA 32 .so on a ARM platform)

at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)

at java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2452)


The reason for this is that your rocksdbjni jar file doesn't include the required shared library (.so file) for the ARM platform. It includes an x86 .so for Linux, a Windows DLL and even a PPC(!) library, but nothing for ARM, so you are going to have to roll your own.
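If you want to see the failure in isolation rather than through Kafka Streams, a minimal check like this (with the stock rocksdbjni jar on the classpath) throws the same UnsatisfiedLinkError on the Pi, since all it does is ask RocksDB to extract and load its bundled native library:

import org.rocksdb.RocksDB;

public class CheckRocksNative {
    public static void main(String[] args) {
        // Extracts the bundled librocksdbjni shared library for the current
        // platform into a temp directory and loads it -- this is the step that
        // fails on ARM because no ARM .so is packaged in the jar.
        RocksDB.loadLibrary();
        System.out.println("RocksDB native library loaded OK");
    }
}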

Step 1 - Doing the apt-gets

You have to do the build ON the Raspberry Pi as it's native C++ code that needs to be compiled, and the instructions for RocksDB aren't overly helpful as they assume a couple of things are already installed.
The following appear to be required:
sudo apt-get install cmake

sudo apt-get install vagrant

sudo apt-get update --fix-missing

sudo apt-get install vagrant


The vagrant part appears not to be strictly required, but I had issues when it wasn't installed, so something the apt-get pulls in is probably a needed dependency.


Step 2 - getting the code for RocksDB

Now you are going to have to download the code for RocksDB. I hope you've got a fairly large SD card, as this will take up around a gig of space all told.

Make a directory under your home directory (I call mine 'dev') and change into the directory (cd dev).

git clone https://github.com/facebook/rocksdb.git rocksdb


Then change directory into rocksdb.  If you are a Java developer you are possibly about to discover, for the first time, the tool that Java people created Ant and Maven to get away from, because they REALLY hated how it worked...


Step 3 - running make


Then you want to run 

make rocksdbjavastaticrelease


This will take a while, but it will produce a nice JNI jar file for your Raspberry Pi that you can add to your classpath and you will be away... the jar file will be under the "java/target" directory under rocksdb, so if you are on the default install with user "pi" it will be

/home/pi/dev/rocksdb/java/target/rocksdbjni-6.12.0-linux32.jar


Copy that file onto your classpath and RocksDB should now work fine.
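To double-check the home-grown jar before wiring it back into Kafka Streams, a quick smoke test along these lines (the path and key/value are just illustrative) opens a throwaway database, writes a key and reads it back:

import org.rocksdb.Options;
import org.rocksdb.RocksDB;

public class RocksSmokeTest {
    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        // Open (creating if needed) a throwaway database under /tmp, write one
        // key and read it back to prove the ARM native library actually works.
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/rocksdb-smoke-test")) {
            db.put("key".getBytes(), "value".getBytes());
            System.out.println(new String(db.get("key".getBytes())));
        }
    }
}

If that prints "value", the native build is good.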


Step 4 - if running a Kafka Streams application

I'm sure there is a logical reason for this, and that I could have found another workaround, but having run the original application and hit the linker error when it accessed RocksDB, there was a new error when running Kafka Streams:

Exception in thread "main-broker-00a490b4-b50a-4cc4-aded-d6324ad0f291-StreamThread-1" org.apache.kafka.streams.errors.ProcessorStateException: Error opening store reading-topic-STATE-STORE-0000000000 at location /tmp/kafka-streams/main-broker/0_0/rocksdb/reading-topic-STATE-STORE-0000000000

at org.apache.kafka.streams.state.internals.RocksDBTimestampedStore.openRocksDB(RocksDBTimestampedStore.java:87)

at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:188)

at org.apache.kafka.streams.state.internals.RocksDBStore.init(RocksDBStore.java:224)

at org.apache.kafka.streams.state.internals.WrappedStateStore.init(WrappedStateStore.java:48)

at org.apache.kafka.streams.state.internals.ChangeLoggingKeyValueBytesStore.init(ChangeLoggingKeyValueBytesStore.java:42)


This is weird as the application had never successfully run (it couldn't create the DB), but for some reason there is state somewhere (I assume in the broker) that still exists for the application id, I assume because that part runs before the native library gets loaded.  So the fix is to change "application.id" in the Kafka configuration to something new and it should all run fine.
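For reference, if your configuration is built in code rather than a properties file, the only change is the application.id key. The new id below is made up (judging by the thread names above, the original was "main-broker") and the broker address is whatever yours happens to be:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsProps {
    // Builds the Streams config with a fresh application.id -- whatever stale
    // state is hanging around is keyed by this value, so any id that hasn't
    // been used before will do.
    static Properties build() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "main-broker-v2"); // was "main-broker"
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        return props;
    }
}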