In this short series, Kevin Schroeder explains how to keep your website on the rails with proper load testing.
Once more into the breach! For my final entry, I will provide some of my favorite tools for building load tests, as well as how to run load tests to best understand site behaviour under various types of loads.
A full treatise on how to build a good load test is likely beyond the scope of a blog, but here’s my box of pro-tips:
- I wholly recommend JMeter for testing.
- When building tests, try to hand code as little as possible.
- For example, JMeter has a browser proxy you can use to capture a session; capture a few sessions with slight variations.
- Hand-coding tests typically results in more predictable test runs – you don’t want predictability; you want entropy.
- Use Xpaths or CSS selectors to extract content from the page.
- Cover as many pages as possible to defeat the many optimizations found in today’s CPUs, memory, and file systems. Put those pages through their paces!
Whatever you do, try to build tests that don’t just hit the same thing over and over again. Your job is to simulate unpredictable, messy human interactions with entropy.
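To make that concrete, here is a minimal Python sketch of a randomized session generator. The page paths, search terms, and probabilities are all hypothetical placeholders, not values captured from any real site — in practice you would feed it the URLs recorded by JMeter's proxy:

```python
import random

# Hypothetical page pools -- substitute paths captured from your own site.
CATEGORY_PAGES = ["/shoes", "/shirts", "/sale"]
PRODUCT_PAGES = ["/product/101", "/product/202", "/product/303"]
SEARCH_TERMS = ["red", "blue", "gift"]

def random_session(rng=random):
    """Build one messy, human-looking browse path instead of a fixed script."""
    path = ["/"]  # most sessions start at the home page
    for _ in range(rng.randint(2, 6)):  # humans wander a variable number of steps
        action = rng.choice(["category", "product", "search"])
        if action == "category":
            path.append(rng.choice(CATEGORY_PAGES))
        elif action == "product":
            path.append(rng.choice(PRODUCT_PAGES))
        else:
            path.append("/search?q=" + rng.choice(SEARCH_TERMS))
    if rng.random() < 0.1:  # only a small fraction of visitors check out
        path.append("/checkout")
    return path
```

Each generated session hits a different mix of pages, which is exactly the entropy a fixed hand-coded script lacks.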
How long should you run your load test? Some may say a few hours if you want good numbers, but it depends on your setup.
A better answer is “long enough to give each part of the infrastructure time to affect the test.” If you have an hourly CPU-intensive task, make sure the test runs at the same time. If you do a cache-clear every 10 minutes, your test should run during that window.
Running Load Tests
When running a load test, I use a fairly stock implementation of JMeter, but I also use a library that pushes per-request metadata into a MongoDB database for analysis. Mongo’s aggregation functionality extracts analytic data like a champ and presents all of it in convenient graphs. And while JMeter allows you to log most of the same information to a file, having it in a structured, queryable format is useful.
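As an illustration of the kind of aggregation involved, here is a hedged sketch of a pipeline that summarizes response times per request. The field names (`label`, `elapsed_ms`) are assumptions — match them to whatever your JMeter listener actually writes into the collection:

```python
# Field names ("label", "elapsed_ms") are assumptions -- match whatever
# your JMeter listener actually records in the MongoDB collection.
def per_request_stats_pipeline():
    """MongoDB aggregation pipeline: response-time stats per request label."""
    return [
        {"$group": {
            "_id": "$label",               # one bucket per request type
            "count": {"$sum": 1},
            "avg_ms": {"$avg": "$elapsed_ms"},
            "max_ms": {"$max": "$elapsed_ms"},
        }},
        {"$sort": {"avg_ms": -1}},         # slowest requests first
    ]
```

With pymongo, you would run something like `db.requests.aggregate(per_request_stats_pipeline())` and feed the result to your graphing tool.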
For the front-end servers, I simply use vmstat with a 1-second delay piped to a text file. For most load tests, the critical things to watch are the user, system, idle, iowait, and stolen times, as well as the CPU run queue. I push that data into the same MongoDB database and graph the results from there. In the past, as an alternative to MongoDB, I’ve instead used awk to extract the data and Excel’s graphing capabilities to visualize the results.
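A minimal parser for that vmstat output might look like the sketch below. The column positions assume the standard 17-column Linux vmstat layout (`r b swpd ... us sy id wa st`); check your own vmstat header before trusting the indexes:

```python
# Parse one line of `vmstat 1` output.  Column positions assume the
# standard 17-column Linux layout (r b swpd ... us sy id wa st).
def parse_vmstat_line(line):
    fields = line.split()
    if len(fields) < 17 or not fields[0].isdigit():
        return None  # skip the two header lines vmstat prints
    return {
        "run_queue": int(fields[0]),   # r: processes waiting for CPU time
        "user": int(fields[12]),       # us: time spent in user code
        "system": int(fields[13]),     # sy: time spent in the kernel
        "idle": int(fields[14]),       # id
        "iowait": int(fields[15]),     # wa: waiting on disk or network I/O
        "stolen": int(fields[16]),     # st: time taken by the hypervisor
    }
```

Each parsed dictionary can then be timestamped and inserted into the reporting database.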
For databases, I still use vmstat, but rely more on database counters. On MySQL, these are retrievable using SHOW GLOBAL STATUS. I use a script that polls the database once per second, then pushes that data to my reporting database. I usually graph the CRUD operations, query-cache usage, threads running, connection count, and row operations. If there’s a problem with database performance, one of those graphs will light up.
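Because most of those counters are cumulative since server start, the polling script has to turn consecutive `SHOW GLOBAL STATUS` snapshots into per-second rates. Here is a sketch of that delta step; the counter names are real MySQL status variables, but the snapshot dictionaries stand in for whatever your MySQL client library returns:

```python
# Counters of interest from SHOW GLOBAL STATUS.  Most are cumulative,
# so per-second rates come from the delta between consecutive polls.
COUNTERS = ["Com_select", "Com_insert", "Com_update", "Com_delete",
            "Threads_running", "Threads_connected",
            "Innodb_rows_read", "Innodb_rows_inserted"]

GAUGES = {"Threads_running", "Threads_connected"}  # point-in-time, not cumulative

def counter_rates(prev, curr, interval=1.0):
    """Turn two status snapshots (name -> value dicts) into per-second rates."""
    rates = {}
    for name in COUNTERS:
        if name in GAUGES:
            rates[name] = curr.get(name, 0)  # report gauges as-is
        else:
            rates[name] = (curr.get(name, 0) - prev.get(name, 0)) / interval
    return rates
```

The resulting rate dictionaries are what get pushed to the reporting database and graphed.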
If the graphs behave, then you’re simply testing raw throughput. In that case, it’s more important to ramp your concurrency gradually, creating sufficient granularity in your data points to help you diagnose problems. If the site is supposed to handle 200 concurrent threads, don’t configure your test to reach that level in 5 seconds. Though the computer’s doing the work, it needs a human to extract meaning.
Structure tests in a way that helps extract the best meaning from the data. Watch the 95th to 99th percentile, and take a look even if the site is performing reasonably well. When Black Friday or Cyber Monday arrives, it will be the requests in the 99th percentile that break the site.
Simple math provides an explanation. There is only so much work a CPU can do. Say your CPU tops out at 250,000 MIPS, and your 99th-percentile requests require 50,000 MIPS. During a normal day, the 99th percentile consumes 20% of your CPU time. On Cyber Monday, however, when your traffic is five times normal, that 99th percentile will consume 100% of your CPU.
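The arithmetic above can be checked back-of-the-envelope style; all of these numbers are illustrative, not measurements:

```python
# Back-of-the-envelope check of the example numbers above (all illustrative).
cpu_capacity_mips = 250_000   # total work the CPU can do
p99_request_mips = 50_000     # work the 99th-percentile requests need on a normal day

normal_share = p99_request_mips / cpu_capacity_mips  # 0.2 -> 20% of the CPU
cyber_monday_share = min(1.0, normal_share * 5)      # fivefold traffic saturates it
```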
But at the end of the day, it’s a judgment call. Will decreasing your average response time provide a better result than the 99th percentile? Maybe. You’ll have to look at the numbers. But know that when it comes to unpredictability, the 99th percentile usually shows what is happening at a smaller scale.
Piercing the Veil
A load test can be a wonderful tool, or a pointless waste of time and resources. If you choose the correct type, instill entropy, push the estimated needs, and build and execute tests correctly, then you’ll find actionable results.
Thanks for reading! If you enjoyed this series, please feel free to comment below or let me know on Twitter. May all your tests ask and answer the right questions!
Kevin Schroeder is the owner of 10n Software, LLC., an independent consulting firm. He wrote the Magium Selenium testing library at https://www.magiumlib.com/. He has worked for the Magento Expert Consulting Group, MagentoU, and Zend Technologies as a consultant and Technical Evangelist. In his sparse free time, he practices guitar and reads books that have nothing to do with software development.