
Thursday, July 11, 2013

Google and Yahoo have it easy or why Hadoop is only part of the story

We hear lots and lots of hype at the moment around Hadoop, and it is a great technology approach, but there is also lots of talk about how this approach will win simply because Google and Yahoo use it to manage their scale, as if that alone proves it is going to win in traditional enterprises and other big data areas.

Let's be clear, I'm not saying Hadoop isn't a good answer for managing large amounts of information; what I'm saying is that Hadoop is only part of the story, and arguably not the most important part.  I'm also saying that Google and Yahoo have a really simple problem they are attempting to fix; in comparison with large scale enterprises and the industrial internet they've got it easy.  Sure they've got volume, but what is the challenge?
  1. Gazillions of URIs and unstructured web pages
  2. Performant search
  3. Serving ads related to that search
I'm putting aside the Gmails and Google Apps for a moment as those aren't part of this Hadoop piece, but I'd argue they are, like Amazon, more appropriate reference points for enterprises looking at large scale.

So why do Google and Yahoo have it easy?

First off, while it's an unstructured data challenge, this means that data quality isn't a challenge they have to overcome.  If Google serve you up a page when you search for 'Steve Jones' and you see the biology prof, Sex Pistols guitarist and Welsh model but you are looking for another Steve Jones, you don't curse Google because it's the wrong person, you just start adding new terms to try and find the right one; if Google slaps the wrong Google+ profile on the results you just sigh and move on.  Google don't clean up the content.

Not worrying about data quality is just part of not having to worry about the master data and reference data challenge.  Google and Yahoo don't do any master data or reference data work; they can't, as their data sets are external.  This means they don't have to set up governance boards or change operational processes to take control of data, they don't need to get multiple stakeholders to agree on definitions, and no regulator will call them to account if a search result isn't quite right.

So the first reason they have it easy is that they don't need to get people to agree.

The next reason is something that Google and Yahoo do know something about, and that is performance, but here I'm not talking about search results, I'm talking about transactions: the need to have a confirmed result.  Boring old things like atomic transactions and, importantly, the need to get an answer back quickly.  Now clearly Google and Yahoo can do the speed part, but they have the wonderful advantage of not having to worry about the whole transactions piece; sure they do email at great scale and they can custom develop applications to within an inch of their life...  but that isn't the same as getting Siebel, SAP, an old Baan system and three different SOA and EAI technologies working together.  Again there is the governance challenge, and there is the 'not invented here' challenge that you can't ignore.  If SAP doesn't work the way you want... well, you could waste time customising it, but you are better off working with what SAP does instead.

The final reason that Google and Yahoo have it easy is talent and support.  Hadoop is great, but as I've said before companies have a Hadoop Hump problem, and this is completely different to the talent engines at Google and Yahoo.  Both pride themselves on the talent they hire, and that is great, but they also pay top whack and have interesting work to keep people engaged.  Enterprises just don't have that luxury, or more simply they can't justify hiring stellar developers and then having those stellar developers work in support.  When you are continually tuning and improving apps the way Google does that makes sense; when you tend to deliver into production and hand over to a support team it makes much less sense.

So there are certainly things that enterprises can learn from Google and Yahoo, but it isn't true to say that all enterprises will go that way; enterprises have different challenges, and some of them are arguably significantly harder than system performance because they impact culture.  So Hadoop is great, it's a good tool, but just because Google and Yahoo use it doesn't mean enterprises will adopt it in the same way, or indeed that the model taken by Google and Yahoo is appropriate.  We've already seen NoSQL become more SQL in the Hadoop world, and we'll continue to see more and more shifts away from the 'pure' Hadoop MapReduce vision as enterprises leverage the economies of scale but do so to solve a different set of challenges and, crucially, a different culture and value network.

Google and Yahoo are green field companies, built from the ground up by IT folks.  They have it easy in comparison to the folks trying to marshal 20 business divisions, each with their own sales and manufacturing folks, 40 ERPs and 100+ other systems badly connected around the world.



Thursday, March 22, 2012

Dear Google, you've patented my published idea...

Okay, so a while ago I had an idea: that people should blog about ideas they'd had and tag them as 'prior art' as a way to defeat patents.  Well, today I read an article about Google patenting a location specific ad service which takes in local information (the weather) to give people targeted adverts. Back in 2008 I wrote up an idea where I talked about temporal information, including the phrase:
The final piece in the puzzle is then time. "Here" Services would need to know not only where "here" is but they would also need to know when "here" is. I don't want to be told about a show that finished yesterday or about one that is six months away. If it's 11pm tell me about the kebab shop, if it's 9am tell me where to get aspirin.
Now in that post I also talk about ad servers and some sort of federated deployment model, so arguably Google's great big central implementation is 'sufficiently different' to mount as a patent, but I don't think so. However, you can find lots of prior examples of location specific campaign management out there, so the only 'difference' is that Google are talking about taking into account some environmental information to direct the advert.  This is something that retailers already do... ever noticed how they run more adverts for BBQs if it's going to be sunny and more adverts for brollies if it's going to rain?

So what is Google's patent really?  Well, it's the combination of temporal (time/location) advertising with environmental information.  It's incredible that this passes the threshold of being original rather than being something that anyone with decent experience could do.

I've had other ideas at other times but not been arsed to implement them.  Feel free to be the person who actually takes on the challenge.  The real point of this post, however, is that yet another patent has been granted that shouldn't have been.  It's not about privacy moaning, it's actually about business and economic growth.

What Google are patenting is what a corner shop keeper has been doing for as long as there have been corner shops: looking at who is on the street, looking at the weather and then picking the window display.  Re-writing that and putting an 'e' in front of it shouldn't be the bar that has to be cleared to get a patent. Patents should be for genuine ideas that move us forwards and where the creator should be rewarded, not for being the first person to work for a company with enough money to file a patent on the obvious.

Monday, July 04, 2011

Microsoft's Eastern Front: the iPad and mobility

For those who study European wars, the decision to invade Russia consistently stands out as one of the dumbest any leader can make. Not because the Russian army was consistently brilliant or strong, but because the Russian country is just too big and the winters too harsh to defeat via an invasion.

For years this has been the challenge of those taking on Microsoft: they've attacked the desktop market, created products to compete with the profit factories that are Windows and Office, even giving them away in the case of OpenOffice, but the end result was the same... Microsoft remained the massively dominant player. Even when Linux looked like winning on netbooks, the sheer size and power of the Microsoft marketplace ensured that there would be no desktop victories. Sure, Apple has leveraged the iPod and iPhone to drive some more Mac sales, but the dent has been minor.

From one perspective Microsoft has also been the biggest investor on another front, the front of mobile and mobility: billions upon billions have been poured into the various incarnations of Windows on mobile devices, from Tablet PCs and Windows CE to the new Windows Phone 7, and it has consistently been a massive amount of money for a very, very small slice of the pie. This disappointed people who invested in Microsoft, but as long as the profit factories were safe then all was fine.

I think, however, that this failure is about to really hurt Microsoft. Today I'm sitting in a train carriage (treating myself by going First, at my own cost) and there are now 7 iPads open and 2 laptops (of which one is mine). I'm using my laptop as I'm creating PPTs, but if I wasn't I'd be on the iPad too.

The fact that I'm on a Mac is irrelevant; the key fact is that, after Neil Ward-Dutton asked if the stats were any good, I took a walk down the carriages and found that the 3:1 iPad/slab to laptop ratio continued throughout first class and dropped to 1:1 in standard class. So in the "best" case scenario you had 50% of people using and working on iPads (or equivalents), and in the management section it was at 75% iPad domination.

These people are emailing, browsing, creating documents and generally getting on with mobility working. That is a massive shift in 2 years. 2 years ago it would have been laptops out and people using 3G cards or working offline; now it's all about mobility working. This represents a whole new attack on Microsoft's profit factories, and one from a completely different direction than they are used to. With rumours saying that Windows 8 for slabs won't be available until late 2012 or even early 2013, a full desktop/laptop refresh cycle will have gone through before Microsoft can hope to start competing in this space.

I'm normally asked a couple of times on this 5 hour train journey about my ZAGGmate keyboard for the iPad and where I got it, with people saying "that is really good, I could ditch my laptop with that". This concept of mobility extends to how you use things like email. Sure, Outlook is a nice rich email client, but the client on the iPad is pretty good and has the advantage that you don't have to VPN into a corporate environment, you just use the mobile Exchange (an MS product) connection, so mobile signal quality doesn't impact you as much. As an example, on this trip I've had to re-authenticate on the VPN about 12 times; with the iPad, of course, I wouldn't have had to do it once.

It's hard not to feel that while MS has invested billions in the eastern front of mobility, in reality it's left with no actual defences: a Maginot Line, if you will, which has now been roundly avoided by a whole new set of technologies that are not competing with Microsoft in the way it expected.

How long can the profit factories be considered safe? With 1% of all browsing traffic already coming from the iPad and mobility being the new normal, it's a brave person who feels that another 12 or 18 months won't deliver long term damage to Microsoft's core profits.



Why Google Apps plus Google+ would change the market

Okay, I managed to get into Google+... so what did I find? Well, first off I found something with an unusual view on privacy and security. I can send a message to a specific Circle and anyone in that Circle can then share that information with anyone they want. So the ability for private information to go viral is right there... this is something that needs to change for Circles to have any weight. Sure, the cut and paste angle is liable to remain, but that is quite different from the immediacy of sharing.

Secondly, however, I saw a massive opportunity in what Google could do if they combine Google+ with Google Apps, specifically the GAPE products for business. Companies like Yammer are building a nice business in enterprise collaboration. With a bit of focus on security this is exactly what Google could do too... but better.

How?

Well, first off there needs to be the idea of "administrated" Circles, i.e. Circles which are officially vetted and which people can request to join. This would allow not just the sort of fan pages FB has to be created but, more critically, would allow companies to create internal project or information area Circles to promote collaboration. I think administrated Circles would be a positive on both the social and enterprise side. On the social side I think there should be "closed" admin, where a limited set of people can approve access, and "open" admin, where a group is established and people vet themselves in (and potentially out).

Secondly, there needs to be the idea of Google+ restricted to a given domain, à la Yammer, where everyone on it has to have a specific GAPE account. This means that a company can have a private Google+ environment, which when combined with administrated Circles would enable companies to set up collaborative environments rapidly and link them back to corporate directories and the collaborative technologies of Google Apps; for instance a Circle could automatically be established for everyone who is editing or reviewing a document....

Thirdly, and this is where I think Google+ + GAPE would be a real killer, there should be "bridge" Circles between different GAPE domains. These are external collaboration circles where people can be added to the environment from multiple specific companies to provide a cross company collaboration. In a world where collaboration between enterprise partners is becoming key I think that this sort of integration between GAPE (which allows this collaboration on documents) and Google+ would provide a step-change in simplicity for inter-enterprise collaboration.

So there it is, three things that would give Google+ a paying audience for its technology in a place where FB and Twitter just have not been able, nor seem willing, to go. A large market where Google+ could be used as a wedge into GAPE and where Google's security and sharing vision could be put to brilliant use.

Personally I said that not bundling Orkut into GAPE back in 2007 was a mistake; now it's time for Google to prove that decision was right, because they've now got the technology to do much more than simply a corporate social network.

Now they've got the ability to create a fully collaborative company and drive inter-company collaboration.




Monday, April 13, 2009

JDO makes a comeback

Back in, I think, 2002 I went to JavaOne and the most over-subscribed sessions were on JDO, literally queues out of the room. Since then it sort of died a death and with the Hibernate/EJB 3.0 approach taking over the OO persistence layer surely the battle was lost.

Just kicking off a Google App Engine for Java (thank you Google, I can code like a human being again) session, I note with some amusement that they've picked JDO as the persistence layer. It makes sense for what they do and the sort of store they provide, but I still find it quite funny that I'm now dusting off those JDO pieces that back in 2002 I was sure would be the future, and about which I was proven wrong.

Until today ;)


Tuesday, November 04, 2008

MSN v Google.... does Microsoft own MSN?

I'm just updating a home PC to XP SP3 to install the wonderful Adobe CS4, so I did a search in IE on MSN.com for "XP SP3" and got....

Google of course redirect straight to the XP SP3 download.

Seriously if you can't get your own terms right how credible will people think you are?

The general point here is about credibility. Don't make up numbers that can be easily disproven and make sure that operational tests actually work.

Simply put: undersell and over-deliver.


Saturday, October 18, 2008

Now we're caching on GAs

GAs = Google AppEngineS. Rubbish, I know, but what the hell. So I've just added in memcache support for the map, so now it's much, much quicker to get the map. Now the time taken to do these requests isn't very long at all (0.01 to do the Mandelbrot, 0.001 to turn it into a PNG), but the server side caching will speed up the request and lead to it hitting the CPU less often...
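For anyone who hasn't used it, the memcache bit really is only a few lines. Here's a minimal sketch of the sort of thing, using the App Engine Python memcache API; the key format and the render hook are my own illustration rather than the actual map code.

from google.appengine.api import memcache

def cached_tile(x, y, zoom, render):
    """Return the PNG bytes for a tile, trying memcache before rendering."""
    key = 'tile:%s:%s:%s' % (x, y, zoom)
    png = memcache.get(key)
    if png is None:
        png = render(x, y, zoom)  # the expensive Mandelbrot + PNG encoding step
        memcache.set(key, png, time=30 * 24 * 60 * 60)  # keep it server side for 30 days
    return png

The nice bit is that a cache miss just falls through to the normal calculation, so the worst case is no more expensive than before.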

Which means that we do still have the option of making the map squares bigger, as the CPU won't get hit as often, which means we won't violate the quota as often.

So, in other words, performance tuning for the cloud is often about combining different strategies rather than finding one strategy that works on its own. Making the squares bigger blew out the CPU quota, but if we combine it with the cache then that could reduce the number of times it blows the quota and thus enable it to continue. This still isn't affecting the page view quota, however, and that pesky ^R forces a refresh, and the 302 redirect also makes sure that it's still hitting the server, which is the root of the problem.


Resize not helping

Well, the resize to 100x100 isn't working because it's now crapping out: I've got 200+ requests that all smash the CPU limit, which blows another quota limit.

So basically you can bend the CPU limit, but if you send 200+ requests in a couple of seconds which all break the limit then that one kicks in as well.

Back to 54x54 and some extra planning.


Google App Engine performance - Lots and Lots of threads

Just a quick one here. While Google App Engine's Python implementation limits you to a single thread, it certainly isn't running a single thread and servicing all requests from it. When running locally (where the performance per image is about the same) it certainly does appear to be single threaded, as it takes an absolute age and the logs always show one request after another. On the server, however, it's a completely different case, with multiple requests being served at the same time. This is what the quota breaking graphs seem to indicate: it's servicing 90 requests a second, which suggests that App Engine just starts a new thread (or pulls one from a thread pool) for each new request and spreads these requests across multiple CPUs. The reason I say that latter piece is that the performance per image is pretty much the same all the time, which indicates each one is getting dedicated time.

So lots of independent threads running on lots of different CPUs but probably sharing the same memory and storage space.


Google App Engine, not for Web 2.0?

Redirect doesn't help on quota but something else does...



The block size has increased from 54x54 to 100x100 which still fits within the quota at the normal zoom (but is liable to break quota a bit as we zoom in). This moves the number of requests per image down from 625 to 225 which is a decent drop. Of course with the redirect we are at 450 but hopefully we'll be able to get that down with some more strategies.

The point here is that when you are looking to scale against quotas it is important to look at various things not simply the HTTP related elements. If you have a page view quota the easiest thing to do is shift bigger chunks less often.

One thing this does mean, however, is that Google App Engine isn't overly suited to Web 2.0 applications. It likes big pages rather than a sexy Web 2.0 interface with lots and lots of requests back to the server. Gmail, for instance, wouldn't be very good on App Engine, as its interface is always going back to the server to fetch new adverts and check for new emails.

So when looking at what sort of cloud works for you, do think about what sort of application you want to build. If you are doing lots of small Web 2.0 style AJAX requests then you are liable to come a cropper against the page view limit a lot earlier than you thought.


Redirecting for caching - still not helping on quota

As I said, redirect isn't the solution to the problem, but I thought I'd implement it anyway; after all, when I do fix the problem it's effectively a low cost option.



What this does is shift, via a redirect (using 302 rather than 301 as I might decide on something else in future and let people render whatever they want), to the nearest "valid" box. Valid here is considered to be a box whose size (width and height) is a power of 2, based around a grid with a box starting at 0,0. So effectively we find the nearest power of 2 to the width and then just move down from the current point to find the nearest grid position. Not exactly rocket science, and it's effectively doubling the number of hits.
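For anyone wanting to picture it, a rough sketch of that snapping and redirect logic might look like the following; the handler, URL shape and integer coordinates are illustrative assumptions rather than the actual Mandel Map code.

from google.appengine.ext import webapp

def nearest_power_of_two(n):
    """Largest power of 2 that is <= n (e.g. 100 -> 64)."""
    p = 1
    while p * 2 <= n:
        p *= 2
    return p

class TileHandler(webapp.RequestHandler):
    def get(self):
        x = int(self.request.get('x'))
        y = int(self.request.get('y'))
        width = int(self.request.get('width'))

        size = nearest_power_of_two(width)
        # Snap down onto a grid anchored at (0, 0) so every request lands on a
        # "valid" box that other users (and caches) will also hit.
        snapped_x = (x // size) * size
        snapped_y = (y // size) * size

        if (snapped_x, snapped_y, size) != (x, y, width):
            # 302 (temporary) rather than 301, in case the scheme changes later.
            self.redirect('/tile?x=%d&y=%d&width=%d' % (snapped_x, snapped_y, size))
            return
        # ... otherwise render the tile as normal.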


Friday, October 17, 2008

Google App Engine performance under heavy load - Part 4

Okay, so the first time around the performance under relatively light load was pretty stable. Now, given that we are at the Quota Denial Corral, would this impact the performance?


First off, look at the scale; it's not as bad as it looks. We are still normally talking about a 10% range from min to max, and even the worst case is 16%, which is hardly a major issue. But is this a surprise?

No it isn't. The reason is that, because of the quota we are breaking (page views), we aren't actually stressing the CPU quota as much (although using several milliseconds of CPU a second indicates that we are hitting it quite hard). That said, it is still pretty impressive that in a period where we are servicing around 90 requests a second the load behaviour is exactly the same as when the load is significantly lower, or arguably more stable, as the min/max gap is more centred around the average.

So again the stability of performance of App Engine is looking pretty good independent of the overall load on the engine.


Thursday, October 16, 2008

HTTP Cache and the need for cachability

One of the often cited advantages of REST implemented over HTTP is the easy access to caching, which can improve performance and reduce the load on the servers. Now, with the Mandel Map regularly breaking the App Engine quota, the obvious solution is to turn on caching. Which I just have, by adding the line

self.response.headers['Cache-Control'] = 'max-age=2592000'  # 2,592,000 seconds = 30 days


Which basically means "don't come and ask me again for 30 days". Now part of the problem is that hitting reload in a browser forces it to go back to the server anyway, but there is a second and more important problem as you mess around with the map. With the map, double click and zoom in... then hold shift and double click to zoom out again.



Notice how it still re-draws when it zooms back out again? The reason is that the zoom-in calculation just works around a given point and sets a new bottom left of the overall grid relative to that point. This means that every zoom in and out is pretty much unique (you've got a 1 in 2,916 chance of getting back to the cached zoomed-out version after you have zoomed in).

So while the map will appear much quicker the next time you see it, this doesn't actually help in terms of it working quicker as you zoom in and out, or in terms of reducing the server load for people who are mucking about with the map on a regular basis. The challenge, therefore, is designing the application for cachability rather than just turning on HTTP caching and expecting everything to magically work better.

The same principle applies when turning on server side caching (like memcache in Google App Engine). If every user gets a unique set of results then the caching will just burn memory rather than giving you better performance; indeed, performance will get slower, as you will have a massively populated cache but practically no successful hits from requests.

With this application it means that rather than simply doing a basic calculation that forms the basis for the zoom, it needs to do a calculation that forms a repeatable basis for the zoom. Effectively those 54x54 blocks need to be the same 54x54 blocks at a given zoom level for every request. This will make the "click" a bit less accurate (it's not spot on now anyway) but will lead to an application which is much more effectively cachable than the current solution.
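A minimal sketch of what that repeatable basis could look like, assuming tile boundaries are anchored to a fixed grid per zoom level; the base width, the halving-per-zoom scheme and the function names are my own assumptions, not the real code.

import math

BASE_WIDTH = 3.0  # assumed complex-plane width of a tile at zoom level 0

def tile_width(zoom_level):
    """Tiles get finer as you zoom in, but the set of widths is fixed for everyone."""
    return BASE_WIDTH / (2 ** zoom_level)

def snapped_tile(x, y, zoom_level):
    """Bottom-left corner of the fixed-grid tile containing the point (x, y)."""
    w = tile_width(zoom_level)
    return (math.floor(x / w) * w, math.floor(y / w) * w)

# Any click inside the same grid cell now produces the same tile request, so a
# zoom out lands back on URIs that are already sitting in the HTTP cache.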

So HTTP caching on its own doesn't make your application perform any better for end users or reduce the load on your servers. You have to design your application so the elements being returned are cachable in a way that will deliver performance improvements. For some applications it's trivial; for others (like the Mandelbrot Map) it's a little bit harder.



Wednesday, October 15, 2008

Google App Engine - Quota breaking on a normal day

Okay, after yesterday's quota smashing efforts I turned off the load testing and just let the normal website load run, but with the "standard image" that I use for load profiling being requested every 30 seconds. That gives a load of less than 3,000 requests a day in addition to the Mandel Map and gadget requests.

So it's pretty clear that requests are WAY down from yesterday, with a peak of under 8 requests a second compared to a sustained load of around 90 requests a second. So how did this impact the quota? Well, it appears that once you break the quota you are going to get caught more often, almost like you get onto a watch list.
Interestingly though, you'll note that again the denials don't map directly to demand. There is a whole period of requests where we have no denials and then it kicks in. This indicates that thresholds are being set for fixed rather than rolling periods, i.e. you have a 24 hour block that is measured and that then sets the quota for the next 24 hour block, rather than it being a rolling 24 hour period (where we'd expect to see continual denials against a constant load).
Megacycles are again high on the demand graph but non-existent on the quota graph, and the denials don't correspond directly to the highest CPU demand periods. So it does appear (to me) that the CPU piece isn't the issue here (even though it's highlighting the number of 1.3x quota requests (that standard image)), but more testing will confirm that.

The last test was to determine whether the data measurement was actually working or not. Again we see the demand graph showing lots of data going backwards and forwards, with nearly 4k a second being passed at peak. It takes about 3 Mandel Map requests to generate 1MB of data traffic, so it certainly appears that right now Google aren't totting up on the bandwidth or CPU fronts; it's about the easy metrics of page requests and actual time. They are certainly capturing the information (that is what the demand graphs are) but they aren't tracking it as a moving total right now.

Next up I'll look at the performance of that standard image request to see if it fluctuates beyond its normal 350 - 400 millisecond behaviour. But while I'm doing that I'll lob in some more requests by embedding another Mandel Map.




Google App Engine - breaking quota in a big way

Okay, so yesterday's test was easy: set four browsers running with the Reload plug-in set to every 10 seconds (with one browser set to 5 seconds). That meant there would be 18,780 hits a minute (30 page loads a minute, each pulling the main page plus 625 tile images, so 626 requests per load). Now there are a bunch of quotas on Google App Engine and, as I've noticed before, it's pretty much only the raw time one that gets culled on an individual request.

So it scales for the cloud in terms of breaking down the problem, but now we are running up against another quota: the 5,000,000 page views a month. This sounds like a lot, and it would be if each page were 1 request, but in this AJAX and Web 2.0 world each page can be made of lots of small requests (625 images plus the main page, for starters). Now Google say that they throttle before your limit rather than just letting you go up to it and stopping... and indeed they do.

That shows the requests coming in. Notice the two big troughs? That could be the test machine's bandwidth dropping out for a second, or an issue on the App Engine side. More investigation required. That profile of usage soon hit the throttle.

It looks like you can take a slashdotting for about 2 hours before the throttle limit kicks in. The throttle is also quickly released when the demand goes down. The issue here, however, is that it isn't clear how close I am to a quota and how much I have left; there isn't a monthly page count view and, as noted before, the bandwidth and cycles quotas don't appear to work at the moment.
It still says I've used 0% of my CPU and bandwidth which is a little bit odd given this really does cane the CPU. Bug fixing required there I think!

So basically, so far it appears that App Engine is running on two real quotas: one is the elapsed time that a request takes and the other is the number of page views. If you are looking to scale on the cloud it is important to understand which metrics you really need to measure and which are more informational. As Google become tighter on bandwidth and CPU those will become real metrics, but for now it's all about the number of requests and the time those requests take.


Monday, October 13, 2008

Google App Engine - How many megas in a gigacycle?

Well the Mandel Map testing is well underway to stress out the Google cloud. 12 hours in and there is something a bit odd going on....



Notice that the peak says it was doing 4,328 megacycles a second, and it's generally been doing quite a bit, with the 1,000 megacycles a second barrier being breached on several occasions.

Now for the odd bit. According to the quota bit at the bottom I've used up 0.00 gigacycles of my quota. The data one looks a little strange as well, as it is firing back a lot of images and it's not registering at all. So despite all of that load I've apparently not made a dent in the App Engine measurements for CPU cycles or for data transfer. To my simple mind that one peak of 4,328 megacycles should be around 4 gigacycles however you do the maths. It really does seem to be a staggering amount of CPU and bandwidth that is available if all of this usage doesn't even make it to the 2nd decimal place of significance.

So here it is again to see if this helps rack up the numbers!




Sunday, October 12, 2008

Killing a cloud - MandelMap

One of the things talked about with cloud computing is "horizontal scalability" which basically means being able to do lots of small things that are independent of each other. Now some tasks break down horizontally pretty well, others are a bit more of a chore but the challenge is normally how to re-combine those pieces into the overall set of results.

Horizontal scaling in the cloud also requires you to know when something is reaching its peak so you can add new servers. This could be done by the cloud provider, namely they see you are running out of space and they spin up another instance and ensure that it all runs across the same shared data store. This is fine, unless you are paying for those new instances and suddenly find yourself a few thousand dollars a minute down as a result of some coding problems.

Now the Google cloud forces you down this path even more, as the last set of performance metrics showed. Its focus is really on short lived transactions, which is a big issue for the Mandelbrot set, which is very computationally intensive. Even the little gadget on the right hand side is running at 1.3x quota.

So the solution is then to break the problem down further. Fortunately the Mandelbrot set is perfect for that, as each individual point can be calculated independently of the others, so you can break the problem down as far as you want. I went for a series of 54x54 squares (the gadget is 200x200, so each square is about 1/13th the number of calculations) to create a bigger image. Then I got a little bit silly thanks to GWT and decided to build a "Mandel Map"; in other words, use some of the infinite scrolling bits of Google Maps but applied to a Mandelbrot set.



Double click to zoom in, shift and double click to zoom out, drag to scroll around the map. Now the advantage of using the small bits becomes apparent: you can zoom in for miles and still remain within the quota set by Google, but the actual effect is that you have a bigger image being displayed. Effectively the browser is being used as a "late integration" approach, recombining the information at the point of display and acting as the point of division for breaking the problem into its smaller chunks. Now this is possible in part thanks to clearly parameterised URIs which can be used by clients to break down the problem. This puts a degree of control into the client but has the knock-on effect of coupling the two more tightly together.
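To give a feel for that client-side breakdown, here is a hypothetical sketch of how a grid of parameterised tile URIs could be generated; the URL shape, coordinates and grid size are illustrative rather than taken from the actual GWT code.

def tile_urls(min_x, min_y, step, tiles_per_side):
    """Build the parameterised URIs for a square grid of tiles."""
    urls = []
    for row in range(tiles_per_side):
        for col in range(tiles_per_side):
            urls.append('/mandelbrot?x=%f&y=%f&width=%f&height=%f'
                        % (min_x + col * step, min_y + row * step, step, step))
    return urls

# A 25x25 grid (the 5x5x(5x5) arrangement mentioned in the next paragraph)
# means 625 independent requests for one view of the map:
print(len(tile_urls(-2.0, -1.5, 0.12, 25)))  # 625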

Part of the reason for this is to see how Google copes with having a real horizontal hammering (the grid is 5x5x(5x5) so 625 requests for the basic image) and to provide a set of base data for the next challenge which is around how different types of caching can impact performance and how you actually need to design things with caching in mind. The test here is to see how well the standard load (that gadget at the side) copes while people are playing around with the map. This gives us an idea of what the peak scaling of a given image is and whether Google are doing something for us behind the scenes or not.

Yes, it's not perfect, and it could be done in smaller chunks (etc. etc.), and maybe if I can be bothered I'll tune it. But it's just to demonstrate a principle and took a whole couple of hours; if you want the full screen version, well, here is MandelMap.


Monday, June 30, 2008

Google App Engine performance - Part 3

Okay, so the last piece was simply: when does it cut off from a pure time perspective?

This shows the various daily peaks, and having done a bit more detailed testing the longest request has been 9.27 seconds, with several going above the 9 second mark. These are all massively over the CPU limit, but it appears that the only element that really gets culled is the raw time. I'm doing some more work around the database code at the moment, and it appears that long queries there are also a pretty big issue, especially when using an iterator. The best bet is to do a fetch on the results and then use those results to form the next query rather than moving along the offsets; in other words, if you are ordering by date (newest to oldest) then do a fetch(20), take the date of the last element in the results and on the next query say "date < last.date". Fetch is certainly your friend in these scenarios.
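As a sketch of that fetch-based paging, using the old App Engine db API; the Entry model, property name and page size are my own illustration, not from the actual code.

from google.appengine.ext import db

class Entry(db.Model):
    date = db.DateTimeProperty(auto_now_add=True)

def first_page():
    return Entry.all().order('-date').fetch(20)

def next_page(last_entry):
    # Key the next query off the last result rather than an ever-growing offset:
    # everything strictly older than the last item we already have.
    return (Entry.all()
            .order('-date')
            .filter('date <', last_entry.date)
            .fetch(20))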

So what does this mean? Well, Google aren't culling at the CPU limit straight away but are consistent around the time limit, the performance doesn't have peaks and troughs through the day, and there doesn't seem to be any swapping out of CPU intensive tasks. All in all it's a solid base.

Finally, however, I just had to lob on something from Google Spreadsheets that shows the sort of thing people can do when they have access to real time data and decent programming frameworks.

This just shows the progression of the "default" render over time; if you go by "average" then it will show you the stability that occurs, and if you go by "count" then it shows the number of calculations that all of these stats have been gleaned from, which will help you judge whether it's a decent enough set of data to draw conclusions from.



Wednesday, June 25, 2008

Google App Engine performance - Part 2

So the first analysis was to look at the gadget performance with 40,000 pixels, which gives a fair old number of calculations (it's 16 iterations, for those that want to know). My next consideration was what would happen to a larger image that was further over the threshold. Would that see more issues?



Again there's that cliff (I need to go and look at the code history), but again it's remarkably stable after that point.


I know I shouldn't be surprised, but this is several times over the CPU quota limit (about 5 times in fact), so I was expecting to see a bit more variation as it caned the processor.



Now this shows just how consistent the processing is. It's important to note here that this isn't what Google App Engine is being pitched at right now. Given they've pitched it at data-read intensive apps, I'm impressed at just how level the capacity is. Having two standard deviations sitting around +/- 2%, and even the "exceptional" items only bumping up around the 5% mark, indicates either a load of spare capacity (and therefore not much contention) or some very clever scheduling.

The penultimate bit I wanted to see was whether the four fold increase in calculations resulted in a linear increase in time.



What this graph shows is the raw performance and then the weighted performance (i.e. the blog image's performance divided by 4). Zooming in to compare the weighted blog image (160,000 pixels) against the straight 40,000 pixel gadget we get

Which, very impressively, means that there is a slight performance gain from doing more calculations (and that really is a fraction of 1%, not 0.5 = 50%). It's not enough to be significant, but it is enough to say that the performance is pretty much linear even several times above the performance quota. The standard deviations are also pretty much in line, which indicates a decent amount of stability at this level.

So while this isn't a linear scalability test in terms of horizontal scaling, it does indicate that you are pretty much bound to one CPU, and I'm not seeing much in the way of swapping out (you'd expect the max/min and the std dev on the larger one to be higher if swapping were a problem). So either Google have a massive amount of spare capacity or they are doing clever scheduling with what they have.

The final question is what is the cut off point....


Google App Engine performance - Part 1

Okay, it's been a few weeks now and I thought I'd take a look at the performance and how spiky it was. The highest hit element was the gadget, with a few thousand hits, so I took the data from the "default" image (-2,-1), (1,1) and looked at the performance over time.



Above are all of the results over time. I'm not quite sure if that cliff is a code optimisation or a step increase on Google's side, but the noticeable element is that after the spike it's pretty much level. Delving into the results further, the question was whether there was a notable peak demand time on the platform which impacted the application.



The answer here looks like it's all over the place by hour, but the reality is that it's the first few days that are the issue, and beyond those the performance is within a fairly impressive tolerance.

Beyond the cliff, therefore, the standard deviation and indeed the entire range are remarkably constrained: within about +/- 5% of the average for min/max, with two standard deviations being +/- 2% of the average. These are the requests that sit within 1.2 times the Google App Engine quota limit in terms of CPU, so you wouldn't expect to see much throttling. The next question, therefore, is what happens under higher load...
