Showing posts with label App Engine. Show all posts
Showing posts with label App Engine. Show all posts

Monday, April 13, 2009

JDO makes a come back

Back in, I think, 2002 I went to JavaOne and the most over-subscribed sessions were on JDO, literally queues out of the room. Since then it sort of died a death and with the Hibernate/EJB 3.0 approach taking over the OO persistence layer surely the battle was lost.

Just kicking off a Google App Engine 4 Java (thank you Google, I can code like a human being again) session I note with some amusement that they've picked JDO as the persistence layer. It makes sense for what they do and the sort of store they provide but I still find it quite funny that I'm now dusting off those JDO pieces that back in 2002 I was sure would be the future but about which I was proven wrong.

Until today ;)

Technorati Tags: ,

Saturday, October 18, 2008

Now we're caching on GAs

GAs = Google AppEngineS. Rubbish I know but what the hell. So I've just added in Memcache support for the map so now its much much quicker to get the map. Now the time taken to do these requests isn't very long at all (0.01 to do the mandelbrot, 0.001 to turn it into a PNG) but the server side caching will speed up the request and lead to it hitting the CPU less often...

Which means that we do still have the option of making the map squares bigger now as it won't hit the CPU as often, which means we won't violate the quota as often.

So in other words performance tuning for the cloud is often about combining different strategies rather than one strategy that works. Making the squares bigger blew out the CPU quota, but if we combine it with the cache then this could reduce the number of times that it blows the quota and thus enables it to continue. This still isn't effecting the page view quota however and that pesky ^R forces a refresh and the 302 redirect also makes sure that its still hitting the server, which is the root of the problem.

Technorati Tags: ,

Resize not helping

Well the resize to 100x100 isn't working because its now crapping out because I've got 200+ requests that all smash the CPU limit which blows another quota limit.

So basically you can bend the CPU limit, but if you send 200+ requests in a couple of seconds which all break the limit then that one kicks in as well.

Back to 54x54 and some extra planning.

Technorati Tags: ,

Google App Engine performance - Lots and Lots of threads

Just a quick one here. While Google App Engine's Python implementation limits you to a single thread it certainly isn't running a single thread and servicing requests from it. When running locally (where the performance per image is about the same) it certainly does appear to be single threaded as it takes an absolute age and logs are always one request after another. On the server however its a completely different case with multiple requests being served at the same time. This is what the Quota Breaking graphs seem to indicate as its servicing 90 requests a second which would seem to indicate that the App Engine just starts a new thread (or pulls from a thread pool) for each new request and is spreading theses requests across multiple CPUs. The reason I say that later piece is that the performance per image is pretty much the same all the time which indicates each one is getting dedicated time.

So lots of independent threads running on lots of different CPUs but probably sharing the same memory and storage space.

Technorati Tags: ,

Google App Engine, not for Web 2.0?

Redirect doesn't help on quota but something else does...



The block size has increased from 54x54 to 100x100 which still fits within the quota at the normal zoom (but is liable to break quota a bit as we zoom in). This moves the number of requests per image down from 625 to 225 which is a decent drop. Of course with the redirect we are at 450 but hopefully we'll be able to get that down with some more strategies.

The point here is that when you are looking to scale against quotas it is important to look at various things not simply the HTTP related elements. If you have a page view quota the easiest thing to do is shift bigger chunks less often.

One point that this does mean however is that Google App Engine isn't overly suited to Web 2.0 applications. It likes big pages rather than having a sexy Web 2.0 interface with lots and lots of requests back to the server. GMail for instance wouldn't be very good on App Engine as its interface is always going back to the server to get new adverts and checking for new emails.

So when looking at what sort of cloud works for you, do think about what sort of application you want to do. If you are doing lots of Web 2.0 small style AJAX requests then you are liable to come a cropper against the page view limit a lot earlier than you thought.

Technorati Tags: ,

Redirecting for caching - still not helping on quota

As I said, redirect isn't the solution to the problem but I thought I'd implement it anyway, after all when I do fix the problem its effectively a low cost option anyway.



What this does is shifts via a redirect (using 302 rather than 301 as I might decide on something else in future and let people render what ever they want) to the nearest "valid" box. Valid here is considered to be a box of size (width and height) of a power of 2 and based around a grid with a box starting at 0,0. So effectively we find the nearest power of 2 to the width then just move down from the current point to find the nearest one. Not exactly rocket science and its effectively doubling the number of hits.

Technorati Tags: ,

Friday, October 17, 2008

Caching strategies when redirect doesn't work

So the first challenge of cachability is solving the square problem. Simply put the created squares need to be repeatable when you zoom in and out.

So the calculation today is just find the click point and find the point halfway between that and the bottom left to give you the new bottom left.

The problem is that this is unique. So what we need to find is the right bottom left that is nearest to the new bottom left.


Now one way to do this would be to do a 301 redirect for all requests to the "right" position. This is a perfectly valid way of getting people to use the right resource and of limiting the total space of the resource set. What you are saying in effect is that a request for resource X is in fact a request for resource Y and you should look at the new place to get it. This works fine in this scenario but for one minor problem.

The challenge we have is page views and a 301 redirect counts as a page view, meaning that we'd be doubling the number of page views required to get to a given resource. Valid though this strategy is therefore it isn't the one that is going to work for this cloud application. We need something that will minimise the page views.

But as this is a test.... lets do it anyway!

Technorati Tags: ,

Google App Engine performance under heavy load - Part 4

Okay so the first time around the performance under relatively light load was pretty stable. Now given that we are at the Quota Denial Coral would this impact the performance?


First off look at the scale, its not as bad as it looks. We are still talking normally about a 10% range from Min to Max and even the worst case is 16% which is hardly a major issue. But is this a surprise?

No it isn't. The reason is that because of the quota that we are breaking (page views) we aren't actually stressing the CPU quota as much (although using several milli-seconds a second indicates that we are hitting it quite hard). That said however it is still pretty impressive that in a period of time where we are servicing around 90 requests a second the load behaviour is exactly the same, or arguably more stable as the min/max gap is more centred around the average, than when it is significantly lower.

So again the stability of performance of App Engine is looking pretty good independent of the overall load on the engine.

Technorati Tags: ,

Thursday, October 16, 2008

HTTP Cache and the need for cachability

One of the often cited advantages of REST implemented in HTTP is the easy access to caching which can improve performance and reduce the load on the servers. Now with breaking app engine quota regularly around Mandel Map the obvious solution is to turn on caching. Which I just have by adding the line

self.response.headers['Cache-Control'] = 'max-age=2592000'


Which basically means "don't come and ask me again for a week". Now part of the problem is that hitting reload in a browser forces it to go back to a server anyway but there is a second and more important problem as you mess around with the Map. With the map, double click and zoom in... then hold shift and zoom out again.



Notice how it still re-draws when it zooms back out again? The reason for this is that the zoom in calculation just works around a given point and sets a new bottom left of the overall grid relative to that point. This means that every zoom in and out is pretty much unique (you've got a 1 in 2916 chance of getting back to the cached zoom out version after you have zoomed in).

So while the next time you see the map it will appear much quicker this doesn't actually help you in terms of it working quicker as it zooms in and out or in terms of reducing the server load for people who are mucking about with the Map on a regular basis. The challenge therefore is designing the application for cachability rather than just turning on HTTP Caching and expecting everything to magically work better.

The same principle applies when turning on server side caching (like memcache in Google App Engine). If every users gets a unique set of results then the caching will just burn memory rather than giving you better performance, indeed the performance will get slower as you will have a massively populated cache but have practically no successful hits from requests.

With this application it means that rather than simply do a basic calculation that forms the basis for the zoom it needs to do a calculation that forms a repeatable basis for the zoom. Effectively those 54x54 blocks need to be the same 54x54 blocks at a given zoom level for every request. This will make the "click" a bit less accurate (its not spot on now anyway) but will lead to an application which is much more effectively cachable than the current solution.

So HTTP Cache on its own doesn't make your application perform any better for end users or reduce the load on your servers. You have to design your application so the elements being returned are cachable in a way that will deliver performance improvements. For some applications its trivial, for others (like the Mandelbrot Map) its a little bit harder.


Technorati Tags: ,

Wednesday, October 15, 2008

Google App Engine - Quota breaking on a normal day

Okay after yesterday's quota smashing efforts I turned off the load testing and just let the normal website load go but with the "standard image" that I use for load profiling being requested every 30 seconds. That gives a load of less than 3000 requests in addition to the Mandel Map and gadget requests.

So its pretty clear that requests are WAY down from yesterday at a peak of under 8 requests a second which is down from a sustained load of around 90 requests a second. So how did this impact the quota? Well it appears that once you break the quota that you are going to get caught more often, almost like you get onto a watch list.
Interestingly though you'll note that again the denials don't match straight to demand. There is a whole period of requests where we have no denials and then it kicks in. This indicates that thresholds are being set for periods of time which are fixed rather than rolling, i.e. you have a 24 hour block that is measured and that then sets the quota for the next 24 hour block rather than it being a rolling 24 hour period (where we'd expect to see continual denails against a constant load).
Megacycles are again high on the demand graph but non-existant on the quota graph and the denails don't correspond directly to the highest CPU demand periods. So it does appear (to me) that the CPU piece isn't the issue here (even though its highlighting the number of 1.3x quota requests (that standard image)) but more testing will confirm that.

The last test was to determine whether the data measurement was actually working or not. Again we see the demand graph showing lots of data going forwards and backwards with nearly 4k a second being passed at peak. It takes about 3 Mandel Map requests to generate 1MB of data traffic so it certainly appears that right now Google aren't totting up on the Bandwidth or CPU fronts, its about the easy metrics of page requests and actual time. They are certainly capturing the information (that is what the demand graphs are) but they aren't tracking it as a moving total right now.

Next up I'll look at the performance of that standard image request to see if it fluctuates beyond its 350 - 400 milliseconds normal behaviour. But while I'm doing that I'll lob in some more requests by embedding another Mandel Map



Technorati Tags: ,

Google App Engine - breaking quota in a big way

Okay so yesterday's test was easy. Set four browsers running with the Reload plug-in set for every 10 seconds (with one browser set to 5 seconds). This meant that there would be 18,780 hits a minute. Now there are a bunch of quotas on Google App Engine and as I've noticed before its pretty much only the raw time one
that gets culled on an individual request.

So it scales for the cloud in terms of breaking down the problem but now we are running up against another quota. The 5,000,000 page views a month. This sounds like a lot, and it would be if each page was 1 request, but in this AJAX and Web 2.0 each page can be made of lots of small requests (625 images + the main page for starters). Now Google say that they throttle before your limit rather than just going up to it and stopping... and indeed they do.

That shows the requests coming in. Notice the two big troughs? That could be the test machines bandwidth dropping out for a second, or an issue on the App Engine side. More investigation required. That profile of usage soon hit the throttle

This looks like you can take a slashdotting for about 2 hours before the throttle limit kicks in. The throttle is also quickly released when the demand goes down. The issue however here is that it isn't clear how close I am to a quota and how much I have left, there isn't a monthly page count view and, as noted before, the bandwidth and cycles quotas don't appear to work at the moment
It still says I've used 0% of my CPU and bandwidth which is a little bit odd given this really does cane the CPU. Bug fixing required there I think!

So basically so far it appears that App Engine is running on two real quotas, one is the real time that a request takes and the other is related to the number of page views. If you are looking to scale on the cloud it is important to understand the metrics you need to really measure and those which are more informational. As Google become tighter on bandwidth and CPU then those will become real metrics but for now its all about the number of requests and the time those requests take.

Technorati Tags: ,

Monday, October 13, 2008

Google App Engine - How many megas in a giga cycle?

Well the Mandel Map testing is well underway to stress out the Google cloud. 12 hours in and there is something a bit odd going on....



Notice that the peak says that it was doing 4328 megacycles a second, and its generally been doing quite a bit with the 1000 megacycle a second barrier being broached on several occasions.

Now for the odd bit. According to the quota bit at the bottom I've used up 0.00 Gigacycles of my quota. Now the data one looks a little strange as well as it is firing back a lot of images and its not registering at all. So despite all of that load I've apparently not made a dent in the Google Apps measurements for CPU cycles or for data transfer. To my simple mind that one peak of 4328 megacycles should be around 4 Gigacycles however you do the maths. It really does seem to be a staggering amount of CPU and bandwidth that is available if all of this usage doesn't even make it to the 2nd decimal place of significance.

So here it is again to see if this helps rack up the numbers!



Technorati Tags: ,