Hi All
This isn’t games related, but it is performance related, so, oh well… ;D
One of the services my company provides is a WAP-based chat service. We’re rapidly approaching 1 million page views a day (not there yet, but getting closer every day) in NZ & Australia alone, with Canada probably being added in a few months and perhaps the US shortly after. While our database & app server boxes aren’t breaking a sweat at the moment (never much more than 15-20% CPU utilisation, even during peak), I’m constantly looking at ways to improve performance (clustering technologies and so on), in case we suddenly get a huge increase in load.
The last couple of days I’ve been looking at IBM & BEA’s virtual machine offerings, but I’m getting some weird/mixed results. I’ve got a JMeter test which covers the full range of functions in our chat service. On an admittedly crappy test server with 512MB of RAM, using Sun’s VM, the test maxes out at about 85-88 requests (page views) per second over roughly 28,000 requests, rising to 95-100 per second on the second run of the test.
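For what it’s worth, the tests are driven from JMeter’s non-GUI mode so the Swing UI doesn’t skew the numbers (the test-plan and log filenames below are just placeholders, not the real ones):

```shell
# -n = non-GUI mode, -t = test plan to run, -l = file to log sample results to
jmeter -n -t chat-loadtest.jmx -l results.jtl
```
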
With BEA’s JRockit, the test hits about 160-170 requests per second for the first 12,000 or so requests, and then the VM has a massive pause (every thread stops) for about 20 seconds, during which the CPU goes completely idle. It eventually recovers, runs for a while, then pauses again. The final result is only about 96 requests per second because of those pauses.
IBM’s VM exhibits a similar pattern, although the pauses aren’t as long, nor is the peak request rate as high. The final result is about the same, though.
Has anyone else experienced similar effects? Or any thoughts as to the cause? I’ve run JRockit in debug mode and it seems to be running code generation during those pauses, but as to why that would kill the entire VM for 20 seconds, I’ve only got guesses.
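In case it helps anyone reproduce this, here’s roughly how I’ve been turning on logging to separate GC pauses from compilation activity (the jar name is a placeholder, and the JRockit verbose module names are from memory of the docs, so double-check them):

```shell
# Sun HotSpot: timestamped GC details, so pauses can be correlated with the JMeter timeline
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -jar server.jar

# BEA JRockit: log memory-system (GC) and code-generation activity
java -Xverbose:memory,codegen -jar server.jar

# IBM VM: standard verbose GC output
java -verbose:gc -jar server.jar
```
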
Couple of final notes:
- on the second run of the test JRockit peaks at around 212 requests per second, but eventually tails off again because of the lengthy pauses.
- I’m running all VMs with whatever GC model I think will result in the lowest GC pauses (Sun=ConcMarkSweepGC, JRockit=gencon, IBM=optavgpause).
- technology-wise (if it lends any clues), we’re running Apache + mod_jk connecting to Jetty (using Velocity plus half an app server to produce the content), and the database is PostgreSQL.
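For concreteness, the GC models mentioned above correspond to these startup flags (again, the jar name is just a placeholder for our actual launch command):

```shell
# Sun HotSpot: concurrent mark-sweep (low-pause) collector
java -XX:+UseConcMarkSweepGC -jar server.jar

# BEA JRockit: generational concurrent collector
java -Xgc:gencon -jar server.jar

# IBM VM: GC policy tuned for short average pauses
java -Xgcpolicy:optavgpause -jar server.jar
```
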
If anyone has any thoughts, I’d appreciate the input.
Thanks,
J