Why Siege isn’t an accurate test tool for Magento performance

We’re getting pretty concerned at Sonassi HQ with the growing confusion within the community surrounding transactions per second (TPS), requests per second (RPS) and concurrency. Shamefully, I fear that we are guilty of starting this trend.

We created a monster, now it’s time to put it down

3 years ago, when we really started to get engaged in web hosting, we launched a little project called Mage Benchmark. The purpose was to have a portal where specialist Magento web hosts could put a brief description about themselves and volunteer a URL for a self-hosted demo store that could be externally tested to give a view of uptime, page load time and concurrency support. This attracted both acclaim and, sadly, criticism from the community.

Early on, before Mage Benchmark, Sonassi and Pro Contractors (two high performance, specialist Magento web hosts) had spurred a trend of performance testing using Siege. It served a purpose: quick and easy testing of server-side changes, and showing performance differences at high levels of concurrency. So when you are testing your own servers, it’s fairly useful – but it’s not without its faults.

That is why we launched Mage Benchmark, so that we could test not only pure PHP throughput via a “Siege-like” test, but also complete full page downloads and replicate the user experience. Originally, it was built on a custom shell application, but now we use Apache JMeter to carry this testing out.

Siege is great

Siege really is a great tool, but what it isn’t is a tool that is designed to test remote locations, at high levels of concurrency, with any accuracy. In fact, it has a number of shortfalls when used in this capacity.

  1. Siege isn’t representative of what a real user (or multiple users) would actually be doing on your website. It can only load the raw response code and HTML, not all the other elements within a page (images, CSS, JS or other static content) – so effectively, it only tests PHP performance.

    It also has very limited session/cookie support, no support for pipelining and only basic support for HTTP/1.1. The load it generates is nothing like that of a real user, so whilst it’s good for a quick reference after changes, it doesn’t really indicate that anything will change for a user in real life.

  2. Siege is easily fooled: it can’t differentiate between a static file being served (i.e. a pure HTML file) and a dynamic one (i.e. a dynamic Magento PHP page). So if you are running any kind of static file proxy, the results are immediately skewed. At that point you’ll only be testing the caching proxy, not the delivery speed behind it.

    So those using Varnish, Nginx caching or mod_pagecache can easily just buffer the page into a cache and you’ll see sub-20ms render times. If you’re using Varnish and then using Siege to test performance, you might as well be loading an image rather than a category URL, as it’ll give the exact same results.

  3. Testing remote servers is almost pointless. As it is a concurrency test (i.e. how many requests can be satisfied repeatedly), the immediate bottleneck is the network connection between the two machines. Latency and TCP/IP overheads are what undermine testing a remote site; the slightest network congestion at a peer between the two servers will immediately show up as reduced performance. What really comes into play is how fast the TCP 3-way handshake can be completed. The server being tested could be serving a dynamic page or a static 0 byte file and you could see exactly the same rates of performance, as connectivity is the bottleneck.

    We can show this using a simple ping. Our data-centres are located in Manchester, United Kingdom, so we’ll try pinging a server in the UK, then a server in the USA, and show the difference. Both servers are connected to the internet via 100Mbit connections.

    Ping from UK to UK

    [~]$ ping www.bytemark.co.uk -c4
    PING www.bytemark.co.uk (212.110.161.177) 56(84) bytes of data.
    64 bytes from extapp-front.bytemark.co.uk (212.110.161.177): icmp_seq=1 ttl=57 time=2.86 ms
    64 bytes from extapp-front.bytemark.co.uk (212.110.161.177): icmp_seq=2 ttl=57 time=2.51 ms
    64 bytes from extapp-front.bytemark.co.uk (212.110.161.177): icmp_seq=3 ttl=57 time=2.54 ms
    64 bytes from extapp-front.bytemark.co.uk (212.110.161.177): icmp_seq=4 ttl=57 time=2.63 ms
    --- www.bytemark.co.uk ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3005ms
    rtt min/avg/max/mdev = 2.515/2.641/2.869/0.142 ms
    

    Ping from UK to USA

    [~]$ ping www.mediatemple.net -c 4
    PING www.mediatemple.net (64.207.129.182) 56(84) bytes of data.
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=1 ttl=49 time=158 ms
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=2 ttl=49 time=154 ms
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=3 ttl=49 time=154 ms
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=4 ttl=49 time=154 ms
    
    --- www.mediatemple.net ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3004ms
    rtt min/avg/max/mdev = 154.155/155.282/158.321/1.802 ms
    

    You can immediately see the difference in performance. A single round trip from the UK to the USA took about 155ms on average, roughly 59 times longer than to a server in the UK. This means that before you even try anything, the maximum throughput a single Siege user can achieve is going to be around 6 transactions per second, due to latency alone.

    Let’s put this to the test then …

    [~]$ siege http://mediatemple.net/ -c 1 -t 10S -b
    ** SIEGE 2.66
    ** Preparing 1 concurrent users for battle.
    The server is now under siege...
    
    Lifting the server siege...      done.
    Transactions:                      31 hits
    Availability:                 100.00 %
    Elapsed time:                  10.37 secs
    Data transferred:               0.12 MB
    Response time:                  0.33 secs
    Transaction rate:               2.99 trans/sec
    Throughput:                     0.01 MB/sec
    Concurrency:                    0.99
    Successful transactions:          31
    Failed transactions:               0
    Longest transaction:            0.71
    Shortest transaction:           0.31
    

    Well, a bit disappointing there, we’ve only managed 3 transactions per second. But this could be down to a slow-to-render dynamic web page, so let’s test a static element (http://mediatemple.net/_images/searchicon.png) of the same size instead (4KB).

    [~]$ siege http://mediatemple.net/_images/searchicon.png -c 1 -t 10S -b
    ** SIEGE 2.66
    ** Preparing 1 concurrent users for battle.
    The server is now under siege...
    Lifting the server siege...      done.
    Transactions:                      32 hits
    Availability:                 100.00 %
    Elapsed time:                  10.46 secs
    Data transferred:               0.10 MB
    Response time:                  0.32 secs
    Transaction rate:               3.06 trans/sec
    Throughput:                     0.01 MB/sec
    Concurrency:                    0.97
    Successful transactions:          32
    Failed transactions:               0
    Longest transaction:            0.48
    Shortest transaction:           0.30
    

    Well, sadly, that’s the same too. But perhaps it is the host that is the problem, so let’s try another website in the US, use a much smaller file (0.06KB) and see.

    [~]$ siege http://www.wiredtree.com/images/arrow.gif -c 1 -t 10S -b
    ** SIEGE 2.66
    ** Preparing 1 concurrent users for battle.
    The server is now under siege...
    Lifting the server siege...      done.
    Transactions:                      50 hits
    Availability:                 100.00 %
    Elapsed time:                   9.89 secs
    Data transferred:               0.00 MB
    Response time:                  0.20 secs
    Transaction rate:               5.06 trans/sec
    Throughput:                     0.00 MB/sec
    Concurrency:                    1.00
    Successful transactions:          50
    Failed transactions:               0
    Longest transaction:            0.20
    Shortest transaction:           0.19
    

    Much better this time, but still under the predicted figure of 6 TPS. Unfortunately, this is always going to be the case: latency will always ruin any concurrency test, even if the remote server is capable of much more. Let’s repeat the exact same test from a server in the USA to see how much latency really affected the test. First up, a quick ping:

    [~]$ ping www.mediatemple.net -c 4
    PING www.mediatemple.net (64.207.129.182) 56(84) bytes of data.
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=1 ttl=52 time=62.8 ms
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=2 ttl=52 time=62.9 ms
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=3 ttl=52 time=62.9 ms
    64 bytes from mediatemple.net (64.207.129.182): icmp_seq=4 ttl=52 time=62.9 ms
    
    --- www.mediatemple.net ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3067ms
    rtt min/avg/max/mdev = 62.872/62.922/62.946/0.029 ms
    
    [~]$ siege http://mediatemple.net/_images/searchicon.png -c 1 -t 10S -b
    ** SIEGE 2.72
    ** Preparing 1 concurrent users for battle.
    The server is now under siege...
    Lifting the server siege...      done.
    
    Transactions:                     73 hits
    Availability:                 100.00 %
    Elapsed time:                   9.62 secs
    Data transferred:               0.22 MB
    Response time:                  0.13 secs
    Transaction rate:               7.59 trans/sec
    Throughput:                     0.02 MB/sec
    Concurrency:                    0.99
    Successful transactions:          73
    Failed transactions:               0
    Longest transaction:            0.14
    Shortest transaction:           0.12
    

    So there you have it, we’ve managed to more than double our transactions per second (from 3.06 to 7.59) without any server-side changes, simply by using a server closer to the test site – showing how sensitive Siege is to network latency.
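As a rough sanity check on the numbers above, here is a minimal Python sketch of the latency ceiling. It assumes that latency is the only cost and, for the two-round-trip case, that each Siege transaction opens a fresh connection (one round trip for the TCP handshake, one for the request and response). That second assumption is ours and isn't confirmed by the output above. The function name `tps_ceiling` is made up for illustration; the RTT values are the ping averages shown earlier.

```python
def tps_ceiling(rtt_ms: float, round_trips: int = 1, concurrency: int = 1) -> float:
    """Upper bound on transactions per second if network latency is the only cost.

    rtt_ms       -- round-trip time between test machine and server, in milliseconds
    round_trips  -- round trips needed per transaction (assumed, not measured)
    concurrency  -- number of simultaneous users
    """
    if rtt_ms <= 0 or round_trips < 1 or concurrency < 1:
        raise ValueError("rtt_ms must be positive; round_trips and concurrency at least 1")
    seconds_per_transaction = (rtt_ms / 1000.0) * round_trips
    return concurrency / seconds_per_transaction


# Average RTTs taken from the ping output in this article.
UK_TO_USA_RTT_MS = 155.282
USA_TO_USA_RTT_MS = 62.922

if __name__ == "__main__":
    for label, rtt in (("UK to USA", UK_TO_USA_RTT_MS), ("USA to USA", USA_TO_USA_RTT_MS)):
        one_trip = tps_ceiling(rtt, round_trips=1)
        two_trips = tps_ceiling(rtt, round_trips=2)
        print(f"{label}: {one_trip:.2f} TPS at 1 round trip, {two_trips:.2f} TPS at 2 round trips")
```

Under those assumptions a single user testing from the UK tops out at roughly 6.4 TPS with one round trip per transaction and about 3.2 TPS with two. From the USA, two round trips gives about 7.9 TPS. Both two-round-trip figures sit close to the 3 and 7.59 TPS that Siege reported.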

  4. Siege is going to be limited by the bandwidth available on your test server and the remote server. Once you start hitting higher levels of throughput, the amount of content being downloaded goes up with it. In our last example above, throughput was just 0.02MB/sec, which is a tiny 0.16 Mbps (megabits per second). But when you start to increase the number of concurrent users, things can change radically and it is very easy to saturate the network connection long before the server itself has reached its capacity.

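    So if the server you were testing from only had 20Mbit of usable bandwidth, the raw ceiling on the 4KB resource mentioned earlier would be around 600 req/s, and you would probably see a maximum of about 500 req/s once headers and TCP/IP overhead are taken into account.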
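The bandwidth figures above are simple arithmetic, sketched here in Python. The function names are hypothetical, and the 800 bytes of per-response overhead (HTTP headers plus TCP/IP framing) is an illustrative guess rather than a measured value.

```python
def throughput_mbit(megabytes_per_second: float) -> float:
    """Convert Siege's MB/sec throughput figure to megabits per second."""
    return megabytes_per_second * 8


def max_requests_per_second(bandwidth_mbit: float, response_bytes: int,
                            overhead_bytes: int = 0) -> float:
    """Requests per second at which the link saturates, ignoring everything else."""
    if bandwidth_mbit <= 0 or response_bytes + overhead_bytes <= 0:
        raise ValueError("bandwidth and total response size must be positive")
    bytes_per_second = bandwidth_mbit * 1_000_000 / 8
    return bytes_per_second / (response_bytes + overhead_bytes)


if __name__ == "__main__":
    # Throughput reported by Siege in the last test above.
    print(f"0.02 MB/sec is {throughput_mbit(0.02):.2f} Mbps")

    # 20Mbit link, 4KB resource, no overhead: the raw ceiling.
    print(f"raw ceiling: {max_requests_per_second(20, 4096):.0f} req/s")

    # Same link with a guessed 800 bytes of overhead per response (assumption).
    print(f"with overhead: {max_requests_per_second(20, 4096, 800):.0f} req/s")
```

The point is less the exact number than how low it is: a modest number of concurrent users fetching small files is enough to fill a 20Mbit link.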


So what should you use to test?

    When testing locally
    By all means use Siege, httperf or ab – they each have their limitations, but can generally give you a comparable view of PHP-only concurrency support. Unless you are looking to test hardware further upstream than your server (firewalls, routers, switches), you should be testing locally. You need to rule out all other possible influences and bottlenecks.

    When testing remotely
    By all means, use any tool listed above – but don’t try to compare the result to what you saw when testing locally. You’ll still have a TPS rate that you can use as a reference, but it isn’t accurate or comparable to what you would see if actually testing locally. Also note that, for the reasons listed above, you are very likely to hit a latency or capacity bottleneck that will render your results fairly worthless.

    Ultimately, don’t concern yourself with high concurrency, remote load testing. If that is a concern, speak to a specialist Magento web host who will do all the specification and testing needed to meet your requirements.
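To see why local and remote TPS figures aren't comparable, it helps to note that TPS, response time and concurrency are tied together: concurrency is roughly TPS multiplied by response time (Little's law). The Python sketch below checks this against the four Siege runs in this article. The function names are hypothetical, and the numbers are copied from the Siege output above.

```python
def implied_concurrency(tps: float, response_time_s: float) -> float:
    """Average number of requests in flight: rate multiplied by time per request."""
    return tps * response_time_s


def expected_tps(concurrency: float, response_time_s: float) -> float:
    """Transaction rate a fixed number of users can reach at a given response time."""
    if response_time_s <= 0:
        raise ValueError("response_time_s must be positive")
    return concurrency / response_time_s


# (label, transaction rate, response time in seconds, concurrency reported by Siege)
SIEGE_RUNS = [
    ("UK to USA, dynamic page", 2.99, 0.33, 0.99),
    ("UK to USA, 4KB image", 3.06, 0.32, 0.97),
    ("UK to USA, 0.06KB image", 5.06, 0.20, 1.00),
    ("USA to USA, 4KB image", 7.59, 0.13, 0.99),
]

if __name__ == "__main__":
    for label, tps, response_time, reported in SIEGE_RUNS:
        computed = implied_concurrency(tps, response_time)
        print(f"{label}: computed {computed:.2f}, Siege reported {reported:.2f}")
```

With concurrency fixed at 1, the transaction rate is just the inverse of the response time, so `expected_tps(1, 0.33)` gives about 3 TPS and `expected_tps(1, 0.13)` about 7.7 TPS. The server did nothing different between those two runs; only the response time seen by the test machine changed.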


    We recommend

    Where you should focus your attention is on making the site that little bit faster for a single user: the real-world page load time perceived by a human. For this, there are some great on-line test tools that will give you a well-rounded impression of how your site actually performs.

    We use JMeter for all our local load testing and for complex unit testing; it’s a great all-round application.

    Yottaa Speed Test
    Pingdom FPT
    GTmetrix

    And remember – caching isn’t king

    When it comes to Magento performance, a cache is just hiding the underlying issue – if you can’t serve a page quickly to begin with, the cache isn’t going to help much. There is always going to be a time when the cache is empty and that first (un-cached) request needs to be made.

    So those using Varnish, Nginx caching, mod_pagecache etc. can easily just buffer the page into a cache and you’ll see sub-20ms render times. If you’re using any caching application and then using Siege to test performance, you might as well be loading an image rather than a category URL, as it’ll give the exact same results. Unless you are specifically trying to test the throughput of your caching application, it should be disabled during the testing process, otherwise it will taint your results with an artificial figure.

    Server-side includes are the exception to this, but that certainly goes beyond the scope of this article (remote performance testing with Siege).

  • http://colin.mollenhour.com Colin Mollenhour

    Very thorough explanation. I’d love to see a tutorial for running a real test with jMeter on Magento. jMeter looks like a PITA and Siege is just so easy!

  • sonassi

    @colinmollenhour:disqus – I doubt I’ll go to the lengths of writing a tutorial for jMeter. It’s fairly long-winded and could potentially go into far too much detail (for the time I have available for article writing!). But what we plan on doing is making the configuration available for download (for the all-important local testing) – and also accessible via a web app (with a PoP on multiple continents).

  • http://colin.mollenhour.com Colin Mollenhour

    Awesome, that’d be much better than a tutorial anyway, methinks.

  • molotovbliss

    I’ll bite… First, your title is a bit inaccurate; it should read “Why Siege isn’t an accurate tool for testing *remote* Magento performance”. It works great locally.

    Static items don’t mean anything to me. If you’re not loading them from a CDN, or using a lightweight HTTP server and multiple domains for each element to get around the max number of connections in browsers, then there isn’t much you can do, as it’s simply a bandwidth issue between the end user and the host of the static files.

    As for Siege, Apache benchmark, etc., they are merely tools to get an “idea” of how both the dynamic and cached versions will hold up under load. Obviously it’s best to keep monitoring after launch with tools like Munin, mytop and htop to ensure you’re not bottlenecking. I’ve set up an Nginx > Varnish > LAMP (eAccelerator, I hate APC) stack with memcache between Apache and MySQL, and the performance of all of them together was second to none compared with anything I had configured before. JMeter is nice if you can stand the tedious steps it takes to set up; me personally, I don’t have the time unless a client’s willing to pay for such.

    Also, it’s fairly easy, if you’re running Varnish or Nginx, to simply point Siege at the “dynamic” backends and test loads. With that said, everything is cached these days. With people’s expectation of instant gratification (think instant breakfast, fast food, instant Facebook access on their phone), seconds matter to end users; it’s a sad but true human element that has occurred.

    Pingdom is horrible IMO; it’s the same as testing a remote server with Siege or any other tool. I don’t trust any 3rd party (remote) tool to accurately calculate any type of load unless I’m sitting on the server itself. Not to mention I had to flag downtime emails from a client long gone, as there are 0 ways to disable them without logging into the original account.

    Ironically, this all reminds me of tuning cars: people bitch and complain all the time about dyno numbers being off -/+ on HP and torque compared to other dynos like a Mustang dyno. In the end it’s just a baseline to gauge against; obviously logging real-time data and dissecting that data is going to be the best weapon in tuning, both cars and Magento :)

    Great article however, and this is just my opinion.

  • sonassi

    @molotovbliss:disqus – Absolutely, you’ve hit the nail on the head, hence why I opened with “Siege is great” – the article really is aimed purely at remote testing with Siege, specifically comparison of remote tests with Siege. As long as you understand its limitations and purpose – you’re fine. But there is certainly an air of confusion out there that isn’t helping those “not in the know” make informed decisions.

    I’m surprised by your remark on Pingdom. Although, I think you are referring to it in its capacity as a monitoring tool, not a performance tool. However, we use it as 1 of 3 external monitoring tools for our core network and haven’t had any false positives in 4 years. The tool I mention that they offer is the “Full Page Test”, which is really a replication of Firefox’s Firebug – but with a pretty web-based GUI.

  • sonassi

    Watch this space … you’ll be able to grab it in about 6 weeks from http://www.magebench.com
