One of the bits I always find funny is the "X scales" pitch. Whether it be stateless EJBs, REST or anything else, it's always one of the magic phrases. Mainframes scale really quite effectively; they handle some very impressive numbers. Those Blue Gene systems from IBM seem to scale pretty well too.
The key to the scaling claims in most of these things is that you can throw more tin at the problem. Often this ignores the fact that there is a thumping great database behind the scenes where scaling is a bit trickier, and more expensive, unless they do something smart like Amazon's S3. The point, though, is that sometimes the unexpected happens and you have two options.
1) Scale to the possible peak that occurs in an exceptional circumstance
2) Prepare a static page for the exceptional circumstance
Sometimes, for instance if your website is the way you handle customers in the exceptional case, you have to go for the peak. Lots of times, however, it's just about getting information out.
As an example, the South East of the UK today was brought to a halt by the sort of snow levels that people in Boston would consider a "flurry" and the folks in Scandinavia would just shrug and walk on. This brought lots of the various sites down, for instance SouthEastern (my local rail company) had their site offline for most of the day.
What did they need to tell me? ALL TRAINS ARE CANCELLED INTO LONDON. But their dynamic site couldn't handle it. Later in the day they switched over to a PHP solution with a minimal (single) page on it but it took a good half of the day.
This is why people should always think about the ultimate fail-over for their sites. Sure, you've scaled to some peak, but what if the worst happens and you get treble that peak? The answer is to switch to a file-based approach: load that file into memory and just serve it as fast as you can. It's amazing how many connections you can support when you are just returning a single static page held in memory.
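To make that concrete, here is a minimal sketch of the "load it once, serve it from memory" idea using only the Python standard library. It is not what SouthEastern ran (they used PHP), and the names `load_page`, `make_handler`, `serve`, the notice text and the port are all made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# Hypothetical notice; in practice this would come from load_page() below.
NOTICE_HTML = (
    b"<html><body><h1>ALL TRAINS ARE CANCELLED INTO LONDON</h1></body></html>"
)


def load_page(path: str) -> bytes:
    """Read the static page from disk once, so requests never touch the file."""
    return Path(path).read_bytes()


def make_handler(page: bytes):
    """Build a handler class that answers every GET with the same bytes."""

    class StaticNoticeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # The requested path is ignored: everyone gets the one notice.
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(page)))
            self.end_headers()
            self.wfile.write(page)

        def log_message(self, format, *args):
            pass  # no per-request logging, to keep the work per hit minimal

    return StaticNoticeHandler


def serve(page: bytes, host: str = "127.0.0.1", port: int = 8080) -> None:
    """Serve the in-memory page until the process is stopped."""
    server = ThreadingHTTPServer((host, port), make_handler(page))
    server.serve_forever()
```

Calling `serve(NOTICE_HTML)`, or `serve(load_page("notice.html"))` with a file of your own, starts it. There is no database, no templating and no per-request file I/O, which is the whole point. A real fail-over would more likely sit in a web server or CDN than in a Python process, but the shape is the same.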
Some people will say "scale to that extraordinary peak" but you know what? 99.99% of the people hitting the site were looking for the same single piece of information and saying "normal service will be resumed once the snow has melted" would have been fine for the one random person looking to visit their aunt next June.
Handling failure conditions doesn't always mean that your site hasn't failed; it means that you've coped with that failure in a smart way.