Friday, October 24, 2008

iPhone as a corporate device

Okay, since the 3G came out I've been using it as a corporate device, which means email/calendar/contacts. Previously I was on a Windows Mobile device with a slide-out keyboard, which is sometimes touted as being the "best" for corporate comms.

As a corporate device the iPhone was simply superb. Some people have cited minor security concerns, but seriously, what are the odds of my losing the iPhone as opposed to the last device? My last device was stolen at Seattle Airport by a pickpocket. My iPhone, however, was in a more secure pocket... because I care about it.

So let's just look at usability.

Exchange... well, the push sync was as good as expected and it burnt the battery as expected. But the big question is whether the touch keyboard was as "good" as the old slide-out. Once yes, twice yes, a hundred times yes. Simply put, it's become a joy to use. The only slight niggle I have is that cut-and-paste would help.

Calendar was great, contacts worked great. I wish there was voice dialing as standard, but it's still superb.

And the best bit?

IT'S A BRILLIANT PHONE

Which is where most of the Windows Mobile devices I've had fall down. They've been okay as a mobile office client but rubbish as phones. The iPhone is a good phone. It switches to phone mode really well (and from anywhere), the call quality is good and it's never dropped a call.

So my conclusion is that, not unsurprisingly, the iPhone is a great corporate device because it's a usable device. Now what would really help corporate adoption is a "local" version of the iTunes store as a way to manage enterprise client application deployment. Deploying corporate apps onto mobile devices has long been a pain: either "click on this link and pray" or "Insert this SD card", with the challenges around updates that this brings. A Corporate iTunes store and iTunes extension would solve all of that. No idea if Apple would do it, but that would turn them from being the cool device in the corporation to being the managed cool device in the corporation.




Technorati Tags: ,

Wednesday, October 22, 2008

A note to vendors on the Church Turing Thesis

At work I regularly get vendors throwing about words like "unique", "revolutionary", "game changing", while of course promising to reduce costs, increase agility and pretty much everything else bar world hunger.

I normally ask standard questions in reply at this point, mainly about what it builds upon. What really annoys me is when people then say "it's completely new".

No it isn't, it's an evolution of something. It might be a clever idea, but it's not going to be a completely and utterly new solution unlike anything anyone in the whole world has ever done before. Let me introduce you to the Church-Turing thesis:
"Every effectively calculable function (effectively decidable predicate) is general recursive"


In other words, just TELL me what you are building on and I will have a lot more respect and a lot more interest than if you just tell me that it is 100% new and original. Snake-oil salesmen try that approach; a decent vendor should know the roots of their product and be able to explain the previous things that they have built upon. If Isaac Newton could admit to standing on the shoulders of giants then a software vendor had better be standing on the shoulders of something if they want to be credible.

Your product might be great, but if you can't say from where it comes then I'll just cry snake-oil and ignore you.




Saturday, October 18, 2008

Now we're caching on GAs

GAs = Google AppEngineS. Rubbish I know but what the hell. So I've just added Memcache support for the map, so now it's much, much quicker to get the map. Now the time taken to do these requests isn't very long at all (0.01 to do the Mandelbrot, 0.001 to turn it into a PNG) but the server side caching will speed up the request and lead to it hitting the CPU less often...

Which means that we do still have the option of making the map squares bigger now as it won't hit the CPU as often, which means we won't violate the quota as often.

So in other words performance tuning for the cloud is often about combining different strategies rather than one strategy that works. Making the squares bigger blew out the CPU quota, but if we combine it with the cache then this could reduce the number of times that it blows the quota and thus enable it to continue. This still isn't affecting the page view quota, however: that pesky ^R forces a refresh, and the 302 redirect also makes sure that it's still hitting the server, which is the root of the problem.
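To make the "cache in front of the calculation" idea concrete, here is a minimal sketch. It is not the code behind the map: a plain dict stands in for App Engine's memcache, and `render_tile`, `get_tile`, `stats` and the key format are all names made up for illustration.

```python
# Sketch of server side caching for map squares.
# Assumptions: a dict stands in for memcache, and render_tile is a
# hypothetical placeholder for the Mandelbrot calculation plus PNG encoding.

_cache = {}
stats = {"renders": 0}


def render_tile(x, y, size):
    """Pretend to do the expensive work and count how often it runs."""
    stats["renders"] += 1
    return "tile(%r,%r,%r)" % (x, y, size)


def get_tile(x, y, size):
    """Return the cached square if present, otherwise render and store it."""
    key = "tile:%r:%r:%r" % (x, y, size)
    tile = _cache.get(key)
    if tile is None:
        tile = render_tile(x, y, size)
        _cache[key] = tile
    return tile
```

Asking for the same square twice only runs the calculation once, which is the CPU saving. Every call to `get_tile` is still a request that reaches the server, so the page view count is untouched.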


Resize not helping

Well the resize to 100x100 isn't working. It's now crapping out because I've got 200+ requests that all smash the CPU limit, which blows another quota limit.

So basically you can bend the CPU limit, but if you send 200+ requests in a couple of seconds which all break the limit then that one kicks in as well.

Back to 54x54 and some extra planning.


Google App Engine performance - Lots and Lots of threads

Just a quick one here. While Google App Engine's Python implementation limits you to a single thread, it certainly isn't running a single thread and servicing requests from it. When running locally (where the performance per image is about the same) it certainly does appear to be single threaded, as it takes an absolute age and logs are always one request after another. On the server, however, it's a completely different case, with multiple requests being served at the same time. This is what the Quota Breaking graphs seem to indicate: it's servicing 90 requests a second, which suggests that App Engine just starts a new thread (or pulls from a thread pool) for each new request and is spreading these requests across multiple CPUs. The reason I say that latter piece is that the performance per image is pretty much the same all the time, which indicates each one is getting dedicated time.

So lots of independent threads running on lots of different CPUs but probably sharing the same memory and storage space.


Google App Engine, not for Web 2.0?

Redirect doesn't help on quota but something else does...



The block size has increased from 54x54 to 100x100 which still fits within the quota at the normal zoom (but is liable to break quota a bit as we zoom in). This moves the number of requests per image down from 625 to 225 which is a decent drop. Of course with the redirect we are at 450 but hopefully we'll be able to get that down with some more strategies.

The point here is that when you are looking to scale against quotas it is important to look at various things not simply the HTTP related elements. If you have a page view quota the easiest thing to do is shift bigger chunks less often.

One point that this does mean however is that Google App Engine isn't overly suited to Web 2.0 applications. It likes big pages rather than having a sexy Web 2.0 interface with lots and lots of requests back to the server. GMail for instance wouldn't be very good on App Engine as its interface is always going back to the server to get new adverts and checking for new emails.

So when looking at what sort of cloud works for you, do think about what sort of application you want to build. If you are doing lots of small Web 2.0 style AJAX requests then you are liable to come a cropper against the page view limit a lot earlier than you thought.


Redirecting for caching - still not helping on quota

As I said, redirect isn't the solution to the problem but I thought I'd implement it anyway. After all, when I do fix the problem it's effectively a low cost option anyway.



What this does is shift via a redirect (using 302 rather than 301, as I might decide on something else in future and let people render whatever they want) to the nearest "valid" box. Valid here means a box whose size (width and height) is a power of 2 and which sits on a grid with a box starting at 0,0. So effectively we find the nearest power of 2 to the width, then just move down from the current point to find the nearest grid position. Not exactly rocket science, and it's effectively doubling the number of hits.
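Roughly, the snapping looks like the sketch below. The function names are mine, and "nearest power of 2" is taken here to mean nearest in log terms, which is an assumption rather than a description of the deployed code.

```python
import math


def nearest_power_of_two(width):
    """Nearest power of two to a positive width, measured in log space (assumption)."""
    return 2.0 ** round(math.log2(width))


def snap_down(value, step):
    """Largest multiple of step that is <= value, for a grid anchored at 0."""
    return math.floor(value / step) * step


def valid_box(x, y, width):
    """Move a requested box onto the power-of-two grid."""
    size = nearest_power_of_two(width)
    return snap_down(x, size), snap_down(y, size), size
```

A request for `valid_box(0.3, -0.7, 0.2)` lands on `(0.25, -0.75, 0.25)`, and asking for that box again returns the same box, which is the property a redirect target needs if it is not going to redirect a second time.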


A generic VM that we can run VMs on

A call to arms for a generic VM was put up by Neil McAllister. Now part of this is the "aren't dynamic languages great" camp, which really doesn't like operational costs to be factored in.

The other bit, though, is that this is just funny. The x86 CPU is effectively a generic VM; it's what IBM called the microprocessor when they first created the because unlike dedicated previous approaches it could be many different machines just via programming. A "Virtual" Machine in fact.

Now we have the JVM and the CLR. They are both "generic" VMs; one has much broader platform support and the other has more languages (still not seeing the research around language heterogeneity being a good thing BTW). Sun is getting happy clappy with the scripting world and putting them on the JVM, so we already sort of have that generic VM.

What a generic VM platform will lead to is people creating specific VMs running on top of the generic VM for specific purposes. Not sure that this is a good thing but it will happen.

The JVM is just another machine, it's a generic VM, all you have to do is write the language port. I knew this in January 1997 so it really isn't rocket science.

If you want your favourite scripting language to run on a "generic" VM then port it to the JVM or the CLR. What is all the fuss about?


Friday, October 17, 2008

Redirects and the lack of precision

Okay the redirect plan is out of the window because the mathematical precision isn't good enough and we get into infinite redirect loops... Back to plan A which is to calculate a standard bottom left.

20.5 * 0.2 = 4.10000000000000005

No it ruddy doesn't, it is 4.1

20.5 / 5 = 4.0999999999999996

Java is just as bad at this so it's not a Python fault (although with Java at least it's consistently bad; Python is inconsistently bad as it just defaults to the C implementation on the box).
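For anyone who wants to see the mismatch for themselves, here is a small sketch. The `Decimal` route and the tolerance helper are my suggestions for working around it, not what the map currently does, and `same_point` with its tolerance value is a made-up name and number.

```python
from decimal import Decimal

# Binary floats cannot represent 0.2 exactly, so two routes to "4.1" disagree.
product = 20.5 * 0.2
quotient = 20.5 / 5
print(repr(product), repr(quotient), product == quotient)

# Decimal arithmetic on string inputs keeps these particular values exact.
exact_product = Decimal("20.5") * Decimal("0.2")
exact_quotient = Decimal("20.5") / Decimal("5")
print(exact_product == exact_quotient)


def same_point(a, b, tolerance=1e-9):
    """Compare coordinates with a tolerance instead of ==."""
    return abs(a - b) <= tolerance
```

A redirect that keeps going until the requested coordinate exactly equals the "right" one can loop forever on values like these. Comparing with a tolerance, or doing the grid arithmetic in integers or decimals, avoids that.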


Caching strategies when redirect doesn't work

So the first challenge of cachability is solving the square problem. Simply put the created squares need to be repeatable when you zoom in and out.

So the calculation today is just find the click point and find the point halfway between that and the bottom left to give you the new bottom left.

The problem is that this is unique. So what we need to find is the right bottom left that is nearest to the new bottom left.


Now one way to do this would be to do a 301 redirect for all requests to the "right" position. This is a perfectly valid way of getting people to use the right resource and of limiting the total space of the resource set. What you are saying in effect is that a request for resource X is in fact a request for resource Y and you should look at the new place to get it. This works fine in this scenario but for one minor problem.

The challenge we have is page views, and a 301 redirect counts as a page view, meaning that we'd be doubling the number of page views required to get to a given resource. Valid though this strategy is, it therefore isn't the one that is going to work for this cloud application. We need something that will minimise the page views.

But as this is a test.... let's do it anyway!


Google App Engine performance under heavy load - Part 4

Okay so the first time around the performance under relatively light load was pretty stable. Now given that we are at the Quota Denial Corral, would this impact the performance?


First off look at the scale, it's not as bad as it looks. We are still talking normally about a 10% range from Min to Max and even the worst case is 16%, which is hardly a major issue. But is this a surprise?

No it isn't. The reason is that because of the quota that we are breaking (page views) we aren't actually stressing the CPU quota as much (although using several milliseconds a second indicates that we are hitting it quite hard). That said, it is still pretty impressive that in a period of time where we are servicing around 90 requests a second the load behaviour is exactly the same as when the load is significantly lower, or arguably more stable, as the min/max gap is more centred around the average.

So again the stability of performance of App Engine is looking pretty good independent of the overall load on the engine.


Thursday, October 16, 2008

HTTP Cache and the need for cachability

One of the often cited advantages of REST implemented in HTTP is the easy access to caching, which can improve performance and reduce the load on the servers. Now with Mandel Map regularly breaking App Engine quota, the obvious solution is to turn on caching, which I have just done by adding the line

self.response.headers['Cache-Control'] = 'max-age=2592000'


Which basically means "don't come and ask me again for 30 days" (max-age is in seconds, and 2592000 seconds is 30 days). Now part of the problem is that hitting reload in a browser forces it to go back to the server anyway, but there is a second and more important problem as you mess around with the Map. With the map, double click and zoom in... then hold shift and zoom out again.
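Since max-age is a number of seconds, it is easy to get the lifetime wrong by eye. A small sketch of deriving the header value from a readable duration, where `cache_control_for` is a helper name invented for this example:

```python
from datetime import timedelta


def cache_control_for(lifetime):
    """Build a Cache-Control value from a timedelta (max-age is in seconds)."""
    return "max-age=%d" % int(lifetime.total_seconds())


header = cache_control_for(timedelta(days=30))
print(header)
```

With `timedelta(days=30)` this gives the same `max-age=2592000` as the line above; a one week lifetime would be `max-age=604800`.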



Notice how it still re-draws when it zooms back out again? The reason for this is that the zoom in calculation just works around a given point and sets a new bottom left of the overall grid relative to that point. This means that every zoom in and out is pretty much unique (you've got a 1 in 2916 chance of getting back to the cached zoom out version after you have zoomed in).

So while the next time you see the map it will appear much quicker this doesn't actually help you in terms of it working quicker as it zooms in and out or in terms of reducing the server load for people who are mucking about with the Map on a regular basis. The challenge therefore is designing the application for cachability rather than just turning on HTTP Caching and expecting everything to magically work better.

The same principle applies when turning on server side caching (like memcache in Google App Engine). If every user gets a unique set of results then the caching will just burn memory rather than giving you better performance. Indeed the performance will get slower, as you will have a massively populated cache but practically no successful hits from requests.

With this application it means that rather than simply doing a basic calculation that forms the basis for the zoom, it needs to do a calculation that forms a repeatable basis for the zoom. Effectively those 54x54 blocks need to be the same 54x54 blocks at a given zoom level for every request. This will make the "click" a bit less accurate (it's not spot on now anyway) but will lead to an application which is much more effectively cachable than the current solution.
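One way to get repeatable blocks is to stop passing raw coordinates around and identify each block by integer grid indices at a zoom level. This is only a sketch of that idea under my own assumptions: the grid is anchored at 0,0, the block span halves with each zoom level, and `tile_index`, `tile_key` and `base_span` are illustrative names.

```python
import math


def tile_index(coord, tile_span):
    """Integer index of the tile containing coord, for a grid anchored at 0."""
    return int(math.floor(coord / tile_span))


def tile_key(x, y, zoom, base_span=1.0):
    """Cache key built from integers only, so the same tile always gives the same key."""
    span = base_span / (2 ** zoom)
    return "z%d/%d/%d" % (zoom, tile_index(x, span), tile_index(y, span))
```

Two slightly different clicks such as `(0.30, 0.30)` and `(0.26, 0.49)` at zoom 2 both map to `z2/1/1`, so the second one can be served from cache, whether that is the browser's HTTP cache or a server side one.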

So HTTP Cache on its own doesn't make your application perform any better for end users or reduce the load on your servers. You have to design your application so the elements being returned are cachable in a way that will deliver performance improvements. For some applications it's trivial, for others (like the Mandelbrot Map) it's a little bit harder.



Wednesday, October 15, 2008

Google App Engine - Quota breaking on a normal day

Okay, after yesterday's quota smashing efforts I turned off the load testing and just let the normal website load go, but with the "standard image" that I use for load profiling being requested every 30 seconds. That gives a load of less than 3000 requests a day in addition to the Mandel Map and gadget requests.

So it's pretty clear that requests are WAY down from yesterday, at a peak of under 8 requests a second compared with a sustained load of around 90 requests a second. So how did this impact the quota? Well it appears that once you break the quota you are going to get caught more often, almost like you get onto a watch list.
Interestingly though you'll note that again the denials don't match straight to demand. There is a whole period of requests where we have no denials and then it kicks in. This indicates that thresholds are being set for periods of time which are fixed rather than rolling, i.e. you have a 24 hour block that is measured and that then sets the quota for the next 24 hour block rather than it being a rolling 24 hour period (where we'd expect to see continual denials against a constant load).
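The difference between the two kinds of window is easier to see in code. This is purely an illustration of the distinction, not a claim about how Google implements quotas; both class names and the simple counting rules are my own.

```python
from collections import deque


class FixedWindowCounter(object):
    """Counts requests per fixed block of time; the count resets at each block boundary."""

    def __init__(self, window_seconds, limit):
        self.window_seconds = window_seconds
        self.limit = limit
        self.current_block = None
        self.count = 0

    def allow(self, now):
        block = int(now // self.window_seconds)
        if block != self.current_block:
            self.current_block = block
            self.count = 0
        self.count += 1
        return self.count <= self.limit


class RollingWindowCounter(object):
    """Counts every request seen in the trailing window that ends at 'now'."""

    def __init__(self, window_seconds, limit):
        self.window_seconds = window_seconds
        self.limit = limit
        self.stamps = deque()

    def allow(self, now):
        while self.stamps and self.stamps[0] <= now - self.window_seconds:
            self.stamps.popleft()
        self.stamps.append(now)
        return len(self.stamps) <= self.limit
```

Feed both a constant one request per second with a 10 second window and a limit of 5. The fixed window allows the first five requests of every block and denies the rest, so you get clean stretches with no denials. The rolling window denies continually once it fills up. The graphs look like the first pattern.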
Megacycles are again high on the demand graph but non-existent on the quota graph, and the denials don't correspond directly to the highest CPU demand periods. So it does appear (to me) that the CPU piece isn't the issue here (even though it's highlighting the number of 1.3x quota requests for that standard image), but more testing will confirm that.

The last test was to determine whether the data measurement was actually working or not. Again we see the demand graph showing lots of data going forwards and backwards with nearly 4k a second being passed at peak. It takes about 3 Mandel Map requests to generate 1MB of data traffic, so it certainly appears that right now Google aren't totting up on the Bandwidth or CPU fronts; it's about the easy metrics of page requests and actual time. They are certainly capturing the information (that is what the demand graphs are) but they aren't tracking it as a moving total right now.

Next up I'll look at the performance of that standard image request to see if it fluctuates beyond its normal behaviour of 350 - 400 milliseconds. But while I'm doing that I'll lob in some more requests by embedding another Mandel Map.




Google App Engine - breaking quota in a big way

Okay so yesterday's test was easy. Set four browsers running with the Reload plug-in set for every 10 seconds (with one browser set to 5 seconds). This meant that there would be 18,780 hits a minute. Now there are a bunch of quotas on Google App Engine and, as I've noticed before, it's pretty much only the raw time one that gets culled on an individual request.

So it scales for the cloud in terms of breaking down the problem, but now we are running up against another quota: the 5,000,000 page views a month. This sounds like a lot, and it would be if each page was 1 request, but in this AJAX and Web 2.0 world each page can be made of lots of small requests (625 images + the main page for starters). Now Google say that they throttle before your limit rather than just going up to it and stopping... and indeed they do.
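For the record, here is where the 18,780 comes from, plus how long that rate would take to burn through a whole month's page views if nothing throttled it. The variable names are mine; the numbers are the ones quoted above.

```python
REQUESTS_PER_PAGE = 625 + 1      # 625 map images plus the main page
MONTHLY_PAGE_VIEWS = 5000000     # the monthly page view quota quoted above


def reloads_per_minute(interval_seconds):
    """How many times a browser reloads the page each minute."""
    return 60 // interval_seconds


# Three browsers reloading every 10 seconds and one every 5 seconds.
page_loads = 3 * reloads_per_minute(10) + 1 * reloads_per_minute(5)
hits_per_minute = page_loads * REQUESTS_PER_PAGE
minutes_to_quota = MONTHLY_PAGE_VIEWS / float(hits_per_minute)

print(page_loads, hits_per_minute, round(minutes_to_quota))
```

That is 30 page loads a minute, 18,780 requests a minute, and on paper about 266 minutes (roughly four and a half hours) to use up the full month.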

That shows the requests coming in. Notice the two big troughs? That could be the test machine's bandwidth dropping out for a second, or an issue on the App Engine side. More investigation required. That profile of usage soon hit the throttle.

This looks like you can take a slashdotting for about 2 hours before the throttle limit kicks in. The throttle is also quickly released when the demand goes down. The issue here, however, is that it isn't clear how close I am to a quota and how much I have left. There isn't a monthly page count view and, as noted before, the bandwidth and cycles quotas don't appear to work at the moment.
It still says I've used 0% of my CPU and bandwidth, which is a little bit odd given this really does cane the CPU. Bug fixing required there I think!

So basically so far it appears that App Engine is running on two real quotas: one is the real time that a request takes and the other is related to the number of page views. If you are looking to scale on the cloud it is important to understand the metrics you need to really measure and those which are more informational. As Google become tighter on bandwidth and CPU then those will become real metrics, but for now it's all about the number of requests and the time those requests take.


Monday, October 13, 2008

Google App Engine - How many megas in a giga cycle?

Well the Mandel Map testing is well underway to stress out the Google cloud. 12 hours in and there is something a bit odd going on....



Notice that the peak says that it was doing 4328 megacycles a second, and it's generally been doing quite a bit, with the 1000 megacycle a second barrier being broached on several occasions.

Now for the odd bit. According to the quota bit at the bottom I've used up 0.00 Gigacycles of my quota. Now the data one looks a little strange as well, as it is firing back a lot of images and it's not registering at all. So despite all of that load I've apparently not made a dent in the Google Apps measurements for CPU cycles or for data transfer. To my simple mind that one peak of 4328 megacycles should be around 4 Gigacycles however you do the maths. It really does seem to be a staggering amount of CPU and bandwidth that is available if all of this usage doesn't even make it to the 2nd decimal place of significance.
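The maths in question, as a tiny sketch. It assumes the usual 1000 megacycles to the gigacycle and that the peak rate was held for a single second; the function name is mine.

```python
MEGACYCLES_PER_GIGACYCLE = 1000.0


def megacycles_to_gigacycles(megacycles):
    """Convert a megacycle count to gigacycles."""
    return megacycles / MEGACYCLES_PER_GIGACYCLE


peak_second = megacycles_to_gigacycles(4328)
print("%.2f Gigacycles" % peak_second)
```

One second at the peak rate alone comes to 4.33 Gigacycles at two decimal places, so a quota display still reading 0.00 can't be counting it.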

So here it is again to see if this helps rack up the numbers!




I'm not sure if this depresses me or not...

Nanite is a Ruby thing. Apparently it's new:
"Nanite is a new way of thinking about building cloud ready web applications. Having a scalable message queueing backend with all the discovery and dynamic load based dispatch that Nanite has is a very scalable way to construct web application backends."


Now to naive old me it sounds like something not really as good as Jini. Its difference appears to be that it dispatches work to "least loaded nodes" rather than a Jini/Javaspaces approach which would have workers determine who was the least loaded and then retrieve the work from the pool based on when they complete their previous task.
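To show the two dispatch styles side by side, here is a toy sketch. It is not Nanite's or Jini's API in any way: `push_dispatch` and `pull_dispatch` are invented names, the load figures are just numbers in a dict, and the "space" is an in-process queue.

```python
import queue
import threading


def push_dispatch(tasks, node_loads):
    """Push style: a dispatcher sends each task to the least loaded node it knows of."""
    loads = dict(node_loads)
    assigned = dict((name, []) for name in loads)
    for task, cost in tasks:
        node = min(sorted(loads), key=loads.get)
        assigned[node].append(task)
        loads[node] += cost
    return assigned


def pull_dispatch(tasks, worker_names):
    """Pull style: idle workers take the next task from a shared pool."""
    pool = queue.Queue()
    for task in tasks:
        pool.put(task)
    done = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                task, _cost = pool.get_nowait()
            except queue.Empty:
                return
            with lock:
                done.append((name, task))

    threads = [threading.Thread(target=worker, args=(n,)) for n in worker_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```

In the push version the dispatcher has to know, and keep up to date, how loaded every node is. In the pull version nobody tracks load at all; a worker that is busy simply doesn't come back for more.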

Having scalable message queueing backends isn't new, nor is it unusual to use these to do load-balancing. Some people have moved beyond this for cloud computing.

They say that those who don't learn from history are doomed to repeat it; in IT I think we can add the phrase "poorly".

I'm not that depressed because at least it is using a messaging infrastructure for the dispatch and not using a database as a sync point or something else strange. It would be nice though if the number of times we see "new" technologies that were not as complete as those we used in production 8 years ago was less than the number we see that are clearly better. Currently it appears to be 10 "old/new" inventions to every 1 new invention.... at best.

Now maybe I'm wrong about Nanite (which would be nice) and it really is better than all of these older technologies and is a brand new way to think about the cloud. It can't just be an agent/dispatcher pattern, can it? If it helps people, great; if they learn from some of the other projects that have done this sort of thing before, then super great; if we get some genuinely new advances around work distribution, then fantastic. I just wish projects would not start with the assumption that they are re-defining how people work in a given area without reference to what they have improved upon.



Sunday, October 12, 2008

Killing a cloud - MandelMap

One of the things talked about with cloud computing is "horizontal scalability" which basically means being able to do lots of small things that are independent of each other. Now some tasks break down horizontally pretty well, others are a bit more of a chore but the challenge is normally how to re-combine those pieces into the overall set of results.

Horizontal scaling in the cloud also requires you to know when something is reaching its peak so you can add new servers. This could be done by the cloud provider, namely they see you are running out of space and they spin up another instance and ensure that it all runs across the same shared data store. This is fine, unless you are paying for those new instances and suddenly find yourself a few thousand dollars a minute down as a result of some coding problems.

Now the Google cloud forces you down this path even more as the last set of Performance metrics showed. Its focus is really short lived transactions which gave a big issue for the Mandelbrot set which is very computationally intensive. Even the little gadget on the right hand side is running at 1.3x quota.

So the solution is then to break the problem down further. Fortunately the Mandelbrot set is perfect for that, as each individual point has a known value, so you can break the problem down as far as you want. I went for a series of 54x54 squares (the gadget is 200x200, so each square is about 1/13th the number of calculations) to create a bigger image. Then I got a little bit silly, thanks to GWT, and decided to build a "Mandel Map": in other words, use some of the infinite scrolling bits of Google Maps but applied to a Mandelbrot set.
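Because every point is independent, a square of any size can be calculated on its own. The sketch below shows the idea with the standard escape-time calculation; it is not the code running on App Engine, and `escape_count`, `render_square` and the iteration limit are illustrative choices.

```python
def escape_count(cx, cy, max_iter=100):
    """Iterations before z = z*z + c leaves the radius 2 circle (max_iter if it never does)."""
    z = 0j
    c = complex(cx, cy)
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def render_square(left, bottom, span, pixels=54, max_iter=100):
    """Escape counts for one pixels x pixels square, independent of every other square."""
    step = span / float(pixels)
    return [[escape_count(left + col * step, bottom + row * step, max_iter)
             for col in range(pixels)]
            for row in range(pixels)]


# How much smaller a 54x54 square is than the 200x200 gadget.
work_ratio = (200 * 200) / float(54 * 54)
```

`work_ratio` comes out a little under 14, which is where the "about 1/13th" above comes from.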



Double click to zoom, shift and double click to zoom out. Drag to scroll around the map. Now the advantage of using the small bits becomes apparent: you can zoom in for miles and still remain within the quota set by Google, but the actual effect is that you have a bigger image being displayed. Effectively here the browser is being used as a "late integration" approach to recombine the information at the point of display and act as the point of division for the problem into its smaller chunks. Now this is possible in part thanks to clearly parameterised URIs which can be used by clients to break down the problem. This puts a degree of control into the client but has the knock-on effect of more tightly coupling the two together.

Part of the reason for this is to see how Google copes with having a real horizontal hammering (the grid is 5x5x(5x5) so 625 requests for the basic image) and to provide a set of base data for the next challenge which is around how different types of caching can impact performance and how you actually need to design things with caching in mind. The test here is to see how well the standard load (that gadget at the side) copes while people are playing around with the map. This gives us an idea of what the peak scaling of a given image is and whether Google are doing something for us behind the scenes or not.
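As a sketch of how a client can split the picture up using nothing but parameterised URIs, the function below generates one URL per square for a 25x25 grid (5x5 blocks of 5x5 squares). The base URL and the parameter names `x`, `y`, `span` and `size` are placeholders, not the real MandelMap interface.

```python
from urllib.parse import urlencode


def tile_urls(base_url, left, bottom, span, tiles_per_side=25, pixels=54):
    """One URL per square; the browser fetches them all and stitches the image together."""
    tile_span = span / float(tiles_per_side)
    urls = []
    for row in range(tiles_per_side):
        for col in range(tiles_per_side):
            params = [("x", left + col * tile_span),
                      ("y", bottom + row * tile_span),
                      ("span", tile_span),
                      ("size", pixels)]
            urls.append(base_url + "?" + urlencode(params))
    return urls


# Hypothetical endpoint, used only for illustration.
urls = tile_urls("http://example.invalid/tile", -2.0, -1.5, 3.0)
print(len(urls))
```

That is 625 separate requests for one view of the map. The tight coupling is visible too: the client has to know exactly which parameters the server expects.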

Yes it's not perfect and it could be done in smaller chunks (etc etc) and maybe if I can be bothered I'll tune it. But it's just to demonstrate a principle and took a whole couple of hours. If you want the full screen version, well here is MandelMap.


Monday, October 06, 2008

Compromising for failure - anti-pattern

Description
This is the anti-pattern where people make compromises, mainly to business stakeholders, in order to get things accepted. Unfortunately this leads to a bloating of the programme and a lack of clarity in its goals.

Causes
The causes for this tend to be political intransigence and gaming. Someone will object in order to get their own pet project on the agenda, a political game will then ensue leading to the person getting their pet project in order to support the overall programme, this will be repeated over and over again leading to multiple pet projects being added to the cost, and timescales, of the overall programme.

Effects
A good example of this sort of anti-pattern is the US Congress, where "pork" is regularly attached to bills in order to get certain senators or congressmen to vote in favour. The recent $700bn bailout bill, for instance, had a huge amount of added pork. What this means of course is that things get more expensive than originally budgeted and become more complex to administer (more cost again). It also means in delivery projects that there is a lack of focus on the key goals of the programme, and instead a focus on getting the pet projects done well to keep "Person X" happy.
Rapidly the programme descends into a fractured mess as the pet project sponsors care only about those elements and not at all about the overall objectives. Team members, realising that success can be achieved via a pet project delivery, also focus on those areas as it's a more immediate return. The overall clarity of the project is lost, it loses direction and ultimately fails.

Resolution
The first part of the resolution is making sure you have a very high level sponsor who can drive over pet projects. To do this you need a clear, and concise, vision that a senior exec will sign off on, and then you need to be clear about what does, and what doesn't, drive you towards that objective. The next bit is getting a backbone. It's very easy to give in and put a pet project in; it's harder to stand up and say "no, it doesn't fit here". The key to that is firstly doing it privately and making your objections clear, secondly taking it to the overall sponsor, and finally that sponsor then doing it publicly. Rapidly people will stop trying to force in the pork if they find themselves accountable for trying to push it in.

The final bit is about culture. You need to establish a culture of "do one thing well" that focuses people around simple, clear objectives and makes any pork pushing attempt ridiculously visible. The more concise your vision and objectives, the harder it is for the pork to be added.


Wednesday, October 01, 2008

Boiling the SOAcean - anti-pattern

Description
This is the SOA strategy that tries too much and sets its aspirations far too high. Often it starts as a little point problem before someone suggests that "we could do this as well". Before long the SOA strategy is listing solving world hunger and the Middle East peace process as secondary objectives that will come from meeting the now agreed two objectives of the project: interstellar space travel and universal harmony.

Causes
There are two basic causes of this problem. Firstly it's PowerPoint-driven aspirational stuff, with the people involved knowing they have no real worries about actually having to deliver the stuff they are writing about. Secondly it's a belief that there is infinite budget available, at least in the long term, which is the key part of this anti-pattern. Planning is for a future in which everything is fixed so you might as well get it down on paper today; this leads people to flights of fancy in trying to conceive of all possible outcomes and how SOA could be used to solve them. The centre of this anti-pattern is its lack of grounding in reality; when combined with aspirational planning this leads to the sort of programme that is doomed to underachieve as it would be impossible to deliver.

Effects
The biggest effects are those of perception: firstly people will think that the programme is overstretching itself (they'd be right) and secondly they will be disappointed when it fails to get close to its goals. The lack of focus in the aspirational planning often means that expensive investment is made in infrastructure for potential future requirements. These projects are often cancelled after making that infrastructure investment but not delivering the level of benefits expected for the spend.

Resolution
The key to resolving this problem is grounding it in reality. Push the programme into tight iterations (no more than 3 months) with each iteration delivering quantifiable business benefits. This helps to move the focus away from infrastructure and towards the end game, and also helps to shift the event horizon of the programme into something more manageable; around a year is ideal. The next bit is to adopt YAGNI as a core principle: infrastructure is procured no more than 1 iteration ahead (i.e. if you are on 1 month cycles then you procure at 2 months) with the first iteration being a selection process.

The final piece is making people get their hands dirty. Aspirational planning comes from people who deliver to PowerPoint rather than to live; make them responsible for go-lives and it will focus their minds on achievable objectives.


We're all going on an architecture hunt

I'm not scared, we can't go over it, we can't go under it we'll have to go through it.

Duane talks about Forensic architecture. It's a great read on the lessons that he has learnt. The only thing I'll add is that I don't think we are CSI, we are more Bones; hell, it's pretty much Jurassic Park out there. Documenting architecture after the fact, as Duane says, isn't so much the exception, it's the career.


Pink Floyd meets Frank Sinatra - anti-pattern

Description
This anti-pattern is all about training and learning. It's the "we don't need no education" anti-pattern. It occurs when organisations want to "do everything themselves" and develop in isolation from other efforts and experiences. There can be some reading done but the predominant theme is that it will be done "My Way".

Effect
The effect of this is that the organisation starts creating its own definitions of terms. These are created in a "folksonomy" manner from local opinions rather than taking reference to external sources. Common elements are definitions of "services" that map to things that already exist (see Defensive SOA) rather than looking at a new approach. The other side is that when using technology the company will "learn themselves" how to do the implementation with very little involvement from either the vendor or experienced 3rd parties, the reason for this will be that the company wishes to "learn themselves" how to do SOA, the reality is that they are concerned that external inputs will demonstrate that their current approaches don't work.

The effect of this is that SOA becomes simply another label attached to the company's standard way of working and any changes are purely superficial. SOA rapidly becomes a "best practice we've done for years" and new technology projects fail to exploit the new technology successfully, so little, or negative, improvement is seen in productivity.

Causes

The cause of this is an inward facing IT department which sees external companies, including vendors, as competitors rather than collaborators. Sometimes this is combined with a huge arrogance that the company knows best, but mostly it's a mistaken belief that in order to remain in control you must do everything yourself. A large part of the issue comes from a fear of new ideas being demonstrated as measurably better than what went before, and when tied up with a blame culture this results in a need to protect reputations over open collaboration and improvement.

Resolution
The first part of the resolution is to assess whether you really are that good. If you turn around and find that you are the favoured love child of Google and Amazon, and that Bill Joy, James Gosling and Donald Knuth are fit only to be junior project managers, then carry on: you are right and you are the best. If however you only think that Bill Joy is rather smart, or even employ Bill Joy, then remember his mantra at Sun: "Most smart people work elsewhere". The second stage is to get rid of the Blame element. You've done an okay job and you want to do a better job. That needs to be seen as a good thing not a bad thing; it needs to be seen as the right behaviour. The next stage is a budgetary one: how will you measure the benefit of external spend and how will you justify it? You need to have clear cases on what you expect external spend to bring and how you will measure its impact. This way you keep in control and use external people to help you improve where you are either weak or where you don't want to waste your time.

