Improving Performance in Flex and Scaling BlazeDS
I gave a talk today at work about Flex and BlazeDS, and in particular how to scale both to perform well during high-volume, real-time communication with a Java-based server (in the area of thousands of messages per second). Here are the most helpful bits from my presentation and the ensuing discussion:
Stick to StreamingAMF.
When it comes to real-time data transfer, StreamingAMF is really your only choice for a high-performance endpoint. It offers two simple advantages:
- Streaming connections allow for true push, rather than less-effective fast-polling.
- AMF is a binary protocol, so less data is transferred across the wire.
If you need a more thorough round-up of endpoints, I highly recommend DevGirl’s excellent endpoint explanation.
Batch messages going through BlazeDS to save bandwidth.
BlazeDS adds significant overhead to each message sent across the wire. With thousands of messages per second, this adds up to a very significant amount of bandwidth usage, to the point that performance will be adversely affected.
To compensate for this, buffer consecutive messages together and send several at once. Even a simple timeout that buffers message content for 10ms before sending it all as a single message will save an incredible amount of bandwidth, and a smart buffer that adjusts its timeout based on message activity will do even better.
This is easy to implement, and likely the biggest performance optimization available in a high-volume situation. Definitely worth doing if bandwidth and performance are a concern.
Override default BlazeDS connection settings.
BlazeDS sets two interesting connection limits too low:
First, the <max-streaming-clients> property is used to limit the number of clients that can simultaneously stream from the same server. BlazeDS limits this to 10 by default, so if 11 users connect to your application at the same time, that 11th one won’t get through. This is a serious fault, but we can raise the limit as long as we’re smart about how high we set it.
The reason there is a limit at all is that all connections in BlazeDS use blocking IO. This means that the maximum number of connections Blaze will support is limited by the maximum number of threads in the application container, since it will always require one thread per connection. Fortunately, modern containers support much more than 10 concurrent threads; Tomcat 6, for example, reserves 150 by default and even that can be boosted.
So the rule of thumb here is that you don’t want your connections in BlazeDS to outnumber the threads in your container, and that’s what you should base this limit on. If you require more than a few hundred concurrent connections, you’re out of luck and you’ll have to either wait until Blaze implements non-blocking IO connections, or upgrade to LiveCycle.
The second, much-less-serious configuration is the <max-streaming-connections-per-session> property, which is used to limit the number of concurrent streaming connections a specified browser can support. If this number is exceeded, new connections will not open on the same client. BlazeDS defaults this value to 1 for all browsers, so if a user opens two instances of your streaming app at the same time, on the same machine with the same browser, the second instance will not open.
This limit is dependent on the browser, and many do actually have a hard limit of one single streaming connection at a time. However, newer versions of Internet Explorer/Firefox/Safari, most versions of Opera, and all versions of Chrome support multiple concurrent streaming sessions. To take advantage of this, override the limit in your services config; I’d suggest looking at Patrick Heinzelmann’s awesome RemoteServices example to grab the specific browser numbers and a nifty code sample.
Get low-level with Flash Player.
There are probably a lot of neat Flash Player performance hacks that I’m not aware of, but here are two I’ve figured out so far:
The default frame rate in Flash Player is not very high. I’m not entirely clear on what it is; I’ve seen some sources say 24fps, some say 12fps, and some say it’s completely dynamic. Depending on what you’re doing, you may consider boosting it to draw more often. In particular, this can lower the worst-case latency between a message being triggered, processed and displayed. The flip-side here is that a high frame rate will raise your CPU usage, so that’s a trade-off to keep in mind.
Secondly, for the longest time I was updating the screen whenever I had new data to display. In retrospect, this was ridiculous. How do I know if I’m updating more often than the screen is being refreshed? How do I know how long it will be until this update is actually blitten* to the screen?
A much less naive approach is to make screen updates on the ENTER_FRAME event. This is dispatched right before Flash Player refreshes the display, so it ensures that whatever you do here will be instantly reflected on-screen. As long as this is the only place you’re doing screen updates, you know that you are doing exactly one update per screen refresh, which is ideal provided you’ve calibrated your refresh rate to match how often you receive updates.
The Profiler is your friend.
After all the above tricks have been exhausted, if you still need a performance boost, there’s always code-level optimization. Things like unrolling loops and minimizing allocation using the “new” keyword will add up eventually. A good way to find out which areas will benefit most from such refactoring is to use the profiler that comes with Flex Builder.
The profiler will allow you to view object allocation stacks and method performance. The former is a great way to find memory leaks and the latter is fantastic for finding out which methods are the slowest and thus most qualified for optimization. If you have any curiosity at all about this sort of thing (and you should!) I heartily encourage you to open up the profiler and fiddle around with it a bit; it takes a bit of ramp-up but once you have it figured out it’s a totally indispensable tool.
Hopefully this will be helpful to someone out there. Flex and BlazeDS are both very well documented by Adobe, but these were the handful of cases where I had to go well out of my way to find workable solutions.
* blitten: Past tense of blit.