Cloud Computing: the commoditisation and associated automation in provisioning and management of execution contexts. Which isn't bad in 140 characters. One of the questions, though, was what level of resilience and redundancy a cloud needs to provide and, associated with that, what type of failover it should be delivering.
Dan has made a good case for cloud being about the provisioning API, and I agree that is at the heart of cloud computing: the automated provisioning of virtual machines underpins cloud. But what else? If it's simply about that provisioning and management then my iPhone has a cracking application that enables me to start (provision) and stop Parallels instances on my Mac. It doesn't enable me to provision or deploy new instances (although in theory it could) but I'd argue that my little MacBook Pro would struggle to be described as a cloud environment.
So what is important in cloud? The first piece I'd add to the 140 characters would be failover. If my Mac fails then the images are toast; a cloud should have a level of virtualised redundancy which means that if a single piece of hardware fails, my image does not. This means that running images must not only be portable but must also be actively failed over, either by continually running across multiple hardware instances or by some form of active backup.
If a "cloud" solution is still tied to "box fails, you fail" then for me that is a FAIL in terms of it really moving us on into a virtualised world. If my redundancy is not virtualised then a major part of my SLA isn't virtualised, and I still have to treat failover of individual machines as my problem.
The follow-on question is what level of redundancy is required. Can it all be in a single rack where a back-plane failure could take it all down? Can it be in two different parts of the same DC, so that a major power outage and UPS failure takes it down? Should it be in multiple DCs in a tied pair or triplet, where a major natural disaster like an earthquake could toast it, or must it be geographically redundant?
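The questions above come down to which failure domains the copies of an image span. A minimal sketch of that check, assuming a hypothetical `Placement` record with `geo`, `dc` and `rack` fields (none of these names come from any real provisioning API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where one copy of an image runs (hypothetical model)."""
    geo: str
    dc: str
    rack: str


def domain_key(p: Placement, level: str) -> tuple:
    """Identify the failure domain of a placement at a given level."""
    if level == "rack":
        return (p.geo, p.dc, p.rack)
    if level == "dc":
        return (p.geo, p.dc)
    if level == "geo":
        return (p.geo,)
    raise ValueError(f"unknown level: {level}")


def survives_single_failure(replicas: list, level: str) -> bool:
    """True if losing any one domain at this level leaves a copy running.

    That holds exactly when the copies span at least two distinct
    domains at that level.
    """
    return len({domain_key(p, level) for p in replicas}) >= 2


# Two copies in different DCs of the same geography.
replicas = [Placement("eu", "dc1", "r1"), Placement("eu", "dc2", "r1")]
for level in ("rack", "dc", "geo"):
    print(level, survives_single_failure(replicas, level))
```

Here the pair survives a rack or DC failure but not the loss of the whole geography, which is the "tied pair" case described above.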
I'd say that it will truly be cloud when it achieves the latter, but the bar needs to be a little bit lower, so I'll say there are two classes of clouds, each with multiple levels.
Automatically resilient clouds
These are clouds which provide true virtualisation of compute resources and effectively eliminate the need for hardware failure considerations. Virtual Machines can still fail for software reasons (hello Blue Screen of Death) but hardware and storage failure are managed to a given level.
- Platinum Clouds - Automatic geographical redundancy
- Gold Clouds - Automatic multi-DC redundancy
- Bronze Clouds - Automatic in-DC redundancy
- Stone Clouds - Automatic proximity redundancy (shared hardware infrastructure)
- Lead Clouds - None of the above
The second class is compute grids. These are solutions, often termed clouds, where the application can fail for hardware reasons and where you need to manage against hardware failures yourself. Software failures clearly still occur as well, but the main point is that from a management perspective you have exactly the same redundancy concerns as with a physical server.
- Platinum Compute Grids - Ability to manage and provision your images across multiple Geos
- Gold Compute Grids - Ability to manage and provision your images across multiple DCs in a single geo
- Silver Compute Grids - Ability to manage and provision your images within a DC
- Lead Compute Grids - Everything goes to the same set of racks
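The two lists above amount to a lookup on two questions: is failover automatic, and how widely is the redundancy (or the provisioning) spread? A rough sketch of that lookup follows. The scope labels `"geo"`, `"multi_dc"`, `"single_dc"` and `"proximity"` are my own shorthand for illustration, and treating anything unlisted as a Lead tier is an assumption.

```python
# Tier names are taken from the lists above; the scope keys are
# hypothetical labels invented for this sketch.
AUTOMATIC_TIERS = {
    "geo": "Platinum Cloud",
    "multi_dc": "Gold Cloud",
    "single_dc": "Bronze Cloud",
    "proximity": "Stone Cloud",
}

MANUAL_TIERS = {
    "geo": "Platinum Compute Grid",
    "multi_dc": "Gold Compute Grid",
    "single_dc": "Silver Compute Grid",
}


def classify(automatic_failover: bool, scope: str) -> str:
    """Map an offering to a tier name.

    Anything that matches no listed scope falls through to the
    Lead tier of its class (an assumption of this sketch).
    """
    if automatic_failover:
        return AUTOMATIC_TIERS.get(scope, "Lead Cloud")
    return MANUAL_TIERS.get(scope, "Lead Compute Grid")


print(classify(True, "multi_dc"))   # automatic failover across DCs
print(classify(False, "single_dc")) # you place images yourself within a DC
```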