Nexcess Blog

Load Testing: Asking the Right Questions, Part 1

January 26, 2018

In this short series, Kevin Schroeder explains how to keep your website on the rails with proper load testing. Kevin owns consulting firm 10n Software, LLC, and has written several testing frameworks for Magento, Gmail, Twitter, and other applications.

Welcome to Asking the Right Questions, my three-part series about all things load testing. Specifically, it explains how to prepare your site to weather the eCommerce storm, covering concurrency, types of load tests, and how to build and run them. I will include code samples, real-life examples, and advice on how best to address common pitfalls.

The Wrong Question

It’s a common occurrence for developers. The owner of a website anticipates a spike in web traffic, perhaps due to an upcoming promotion. They ask the natural questions, “Can my site handle the increased traffic? How many thousands of users can my site handle?”

Owners want sites that can “handle X visitors” because more users equal more revenue. This is understandable, but a proper load test measures how well their server handles high numbers of concurrent requests, not just the number of web browsers pointing to the server.  It’s a classic case of what they want distracting them from what they need.

Concurrency 101

A quick-and-easy way to estimate load is to check how many people visited the site in the last 30 minutes, and then find the peak number of requests per second in the access log with:

grep -v "skin\|js\|media\|static" access.log | awk '{ print $4 }' | uniq -c

That command filters the access log to remove all static content, which is not a scaling factor, and then counts the number of completed requests in each second.

The results give you enough to calculate average concurrency over a time period:

(average response time in ms)*(peak requests per second) / 1000 

For example, if your average response time is 500ms, and you have 50 requests per second peak in your log files, your concurrency is about 25 requests.
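
The same calculation can be sketched in Python. This is only an illustration, assuming a common/combined log format where the timestamp sits in square brackets; the sample log lines and the function names are made up for the example.

```python
import re
from collections import Counter

# Lines containing these fragments are treated as static content,
# mirroring the grep filter used on the access log.
STATIC_MARKERS = ("skin", "js", "media", "static")

# Captures the per-second timestamp, e.g. 26/Jan/2018:10:15:32
TIMESTAMP = re.compile(r"\[([^\]\s]+)")


def requests_per_second(lines):
    """Count non-static requests for each one-second timestamp."""
    counts = Counter()
    for line in lines:
        if any(marker in line for marker in STATIC_MARKERS):
            continue
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def estimated_concurrency(avg_response_ms, peak_rps):
    """(average response time in ms) * (peak requests per second) / 1000"""
    return avg_response_ms * peak_rps / 1000


# Hypothetical log lines, for illustration only.
sample_log = [
    '203.0.113.5 - - [26/Jan/2018:10:15:32 +0000] "GET /catalog/shoes HTTP/1.1" 200 5120',
    '203.0.113.6 - - [26/Jan/2018:10:15:32 +0000] "GET /static/site.css HTTP/1.1" 200 900',
    '203.0.113.7 - - [26/Jan/2018:10:15:32 +0000] "GET /checkout/cart HTTP/1.1" 200 7300',
    '203.0.113.8 - - [26/Jan/2018:10:15:33 +0000] "GET /catalog/hats HTTP/1.1" 200 4100',
]

peak_rps = max(requests_per_second(sample_log).values())
print(peak_rps)                        # 2 (the static request is ignored)
print(estimated_concurrency(500, 50))  # 25.0, the worked example above
```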

However, this method only shows average concurrency over a period and lacks key specifics. Are the responses clustered in the first 100ms of the second, or in the last? Are there stretches of high concurrency occasionally disrupted by disastrous performance?
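
To see why clustering matters, here is a small sketch that compares two made-up request patterns with the same average. It assumes you somehow have a start time and duration for each request, which a default access log does not give you; the data and the function name are hypothetical.

```python
def peak_concurrency(requests):
    """Return the largest number of requests in flight at the same time.

    `requests` is an iterable of (start_ms, duration_ms) pairs.
    """
    events = []
    for start, duration in requests:
        events.append((start, 1))              # request begins
        events.append((start + duration, -1))  # request ends
    # Tuples sort by time first; at equal times an end (-1) sorts
    # before a start (+1), so back-to-back requests do not overlap.
    events.sort()

    in_flight = peak = 0
    for _, delta in events:
        in_flight += delta
        peak = max(peak, in_flight)
    return peak


# Four 500 ms requests that start within the same second, spread evenly...
spread = [(0, 500), (250, 500), (500, 500), (750, 500)]
# ...versus the same four requests all arriving in the first 100 ms.
clustered = [(0, 500), (25, 500), (50, 500), (75, 500)]

print(peak_concurrency(spread))     # 2
print(peak_concurrency(clustered))  # 4
```

Both patterns are 4 requests per second at 500ms each, so the formula reports a concurrency of 2 for both, yet the clustered case peaks at twice that.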

Given the missing details and the lack of good tools to find them, one solution is to double your result. For most websites using somewhere between two and ten servers, this doubling helps account for the unknown data and produces a more accurate estimate of performance.

And yet it’s far from ideal, and the reason is entropy. Or more precisely, the lack of it.

Entropy, Your Ally

The problem is that load tests skew to the positive when they’re too neat and orderly. Load tests are usually built to follow a particular pattern. In Magento, this pattern is the home page, category, product page, add-to-cart, and checkout. Often, it’s also the same page each time. This has the effect of “cheating” on the load test.

Too much predictability tends to balloon performance and produce inflated results. It makes life too easy for your database, caches, and file systems. The more consistent your data, the better the system can optimize itself. Your job when writing a proper load test is to “sabotage” those optimizations with entropy.

Introducing entropy requires a fair amount of work and a developer skill set. I have three favorite techniques:

  • Use XPath or CSS post-processors to extract category and product URLs from each page, then request a random selection of them.
  • Add cache-busting random query strings to a certain percentage of requests.
  • Use random pause timers in your test threads to make requests occur at non-predictable times.
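
The three techniques above can be sketched in plain Python. This is not tied to any particular load testing tool, and it swaps the XPath or CSS post-processor for the standard library's HTML parser. The product-link class, the nocache parameter, the 20% fraction, and the pause range are all assumptions made for illustration.

```python
import random
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes and attrs.get("href"):
            self.links.append(attrs["href"])


def pick_random_link(html, css_class, rng):
    """Technique 1: extract matching links and choose one at random."""
    parser = LinkExtractor(css_class)
    parser.feed(html)
    return rng.choice(parser.links) if parser.links else None


def maybe_bust_cache(url, rng, fraction=0.2):
    """Technique 2: add a random query parameter to roughly `fraction` of URLs."""
    if rng.random() >= fraction:
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("nocache", str(rng.randrange(10**9))))
    return urlunsplit(parts._replace(query=urlencode(query)))


def random_pause(rng, low=0.5, high=5.0):
    """Technique 3: choose a think time, in seconds, between requests."""
    return rng.uniform(low, high)


# Hypothetical category page markup, for illustration only.
category_html = """
<a class="product-link" href="/shoes/red-sneaker">Red sneaker</a>
<a class="product-link" href="/shoes/blue-boot">Blue boot</a>
<a class="nav" href="/about">About</a>
"""

rng = random.Random()
next_url = maybe_bust_cache(pick_random_link(category_html, "product-link", rng), rng)
pause_seconds = random_pause(rng)
print(next_url, round(pause_seconds, 2))
```

In a real test thread you would sleep for the chosen pause and then request the chosen URL. The sketch leaves the network call out so it runs anywhere.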

Websites don’t run in a vacuum, and users, as much as you need them, spread chaos. As a developer building a useful load test, it’s your job to simulate that as best you can.

Looking Ahead

Keep an eye out for Parts 2 and 3 next week. I’ll look at two types of load tests – sizing validation and concurrency validation – and explore the dangers of just throwing hardware at your performance woes.
