Saturday, April 05, 2008

Information Oblivion - data only counts where it works

Reading a few email lists recently, I've noticed a worrying trend around SOA that can be compared to the "struct" problem of OO: people are trying to look at information independently from its business context. Often this shows up as the single canonical form mentality, but it's part of a broader problem where people (often called information architects) push the idea that the information is the important bit and that all systems do is move information around and update it.

A lot of these efforts try to treat the data as an independent entity and aim to create data sources that act independently and can therefore be "used" across the enterprise.

The problem with this is that it only works okay for after-the-fact data. A single Customer data source could work, if it's just about the basic customer information. Having a historical record of orders is also okay. The point about these bits of data is that they record what has happened. Where the approach falls down is when you try to apply it to what is happening. The problem there is that this temporal (in-progress) information only makes sense in the context of the current execution.

Disconnecting data from the business service that manipulates it just means that the interpretation logic (the bit in the service that understands what the fields mean at different stages) has to shift either into the data service (so it's no longer just a data service) or into every single service that uses the data (which is bonkers).
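To make the point about interpretation logic concrete, here is a minimal sketch in Python. The names (Order, Stage, OrderService) and the idea that an "amount" field means an estimate while an order is a draft and an agreed total once confirmed are hypothetical, made up purely for illustration. The thing to notice is that the rule for reading the field lives in one place, next to the code that changes it.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    DRAFT = "draft"
    CONFIRMED = "confirmed"


@dataclass
class Order:
    # Hypothetical record: the meaning of `amount` depends on `stage`.
    order_id: str
    stage: Stage
    amount: float


class OrderService:
    """Owns in-flight orders and the rules for reading their fields."""

    def describe_amount(self, order: Order) -> str:
        # The interpretation logic: same field, different meaning per stage.
        if order.stage is Stage.DRAFT:
            return f"estimate of {order.amount:.2f}, may still change"
        return f"agreed total of {order.amount:.2f}"

    def confirm(self, order: Order, agreed_total: float) -> Order:
        # Only the service knows which stage transitions are legal.
        if order.stage is not Stage.DRAFT:
            raise ValueError("only draft orders can be confirmed")
        return Order(order.order_id, Stage.CONFIRMED, agreed_total)
```

If Order were handed out by a bare data service instead, every consumer would need its own copy of describe_amount, and the copies would drift.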

The basic philosophy that has served me well is that data only counts where it is used. After-the-fact reporting is a data-centric element and suits data stores, because there the use really is about the information and just shifting it around: it's structs. But where the data is related to activities, keep the data close to the action. Dump it into the store later, but don't do that until you've finished doing what you need with it.
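As a rough sketch of "keep the data close to the action, dump it into the store later", assume a hypothetical CheckoutService that holds a basket while it is being worked on and a hypothetical HistoryStore standing in for the shared data store. Neither name comes from a real product. Nothing reaches the store until the activity is finished.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryStore:
    """Stand-in for a shared store of after-the-fact records."""
    records: list = field(default_factory=list)

    def append(self, record: dict) -> None:
        self.records.append(record)


class CheckoutService:
    """Keeps in-progress baskets to itself until they are complete."""

    def __init__(self, history: HistoryStore) -> None:
        self._history = history
        # In-flight state, keyed by basket id; private to this service.
        self._in_flight = {}

    def start(self, basket_id: str) -> None:
        self._in_flight[basket_id] = []

    def add_item(self, basket_id: str, sku: str) -> None:
        self._in_flight[basket_id].append(sku)

    def complete(self, basket_id: str) -> dict:
        # Only now does the data become a record of what has happened.
        items = self._in_flight.pop(basket_id)
        record = {"basket_id": basket_id, "items": tuple(items)}
        self._history.append(record)
        return record
```

While a basket is open the store sees nothing. Once complete() has run, the record is plain historical data that reporting can treat as a struct.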

Information is power, but only if you act on it correctly.


1 comment:

Anonymous said...

Couldn't agree more.

In the Data Warehousing world there is always some whiz kid trying (and unfortunately often succeeding) to force "the one perfect data model" design pattern.

What these otherwise intelligent persons fail to realise is that users of the data warehouse are most often also users of the less than perfect source systems. They work every day with the "imperfect" "legacy" data models, and the "perfect" data model just seems a bit off to them.

Worse is the canonical renaming of all the entities. "brick" might seem like a stupid name for a part of a salesman's territory, but if a company has been using the terminology for years, why suddenly refer to "sales sub area"?

And how arrogant is it for someone to arrive at a company and insist on a new name for everything?