A load test either achieves a certain goal, or it dies trying. If your computing world crumbles halfway to that goal, then your performance meters should contain some good clues as to what to fix, reengineer, or upgrade before you try again.
The goals for the load test are often based on what your computing world has handled in the past, plus an increase for the projected growth.
All load test goals boil down to just a few questions. How hard are you going to push your computing world? What response time is acceptable? What error rate is acceptable? What internal meters will you collect?
How Hard To Push
If what you are testing is user visible, then you may frame your goals for how many virtual users will be simultaneously supported. In real life, each user will submit work at a given pace and, for multi-step transactions, usually wait between steps as they read and think about the results of the previous step. This leads to two different ways of counting virtual users: concurrent and simultaneous.
Concurrent virtual users are connected to the system and are requesting work at some regular interval. Simultaneous virtual users are all requesting work at the same time.
If your computing world were a bar, then the concurrent virtual users would be all the patrons in the bar, and the simultaneous virtual users would be the patrons who are currently asking the bartender for a drink.
If you were running a load test as a stress test where the wait time was zero, then concurrent virtual users would equal simultaneous virtual users, as every patron would guzzle each drink served and immediately request a new one.
If your boss wants the goal stated in terms of the number of simultaneous transactions, or users, then all you need to do is set the think time to zero and use enough virtual users to match the goal number, plus a few extra. What are the extra for? Once a virtual user finishes a transaction, it takes some non-zero time to report the results and reset itself to go again. You can figure out the right number of extra virtual users when you are doing the low-power test validation testing. The shorter the transaction is in relationship to the reset time, the more extra virtual users you will need to achieve your goal.
When the average transaction time and the reset time are about the same, each virtual user will spend only half of its time keeping your computing world busy.
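The arithmetic behind those extra virtual users can be sketched as below. The transaction and reset times are made-up illustrative numbers (the "about the same" case), not measurements from any particular tool:

```python
import math

# Hypothetical numbers: a 2-second transaction followed by a
# 2-second report-and-reset pause before the next request.
transaction_time = 2.0   # seconds the virtual user keeps the system busy
reset_time = 2.0         # seconds spent reporting results and resetting

# Fraction of time each virtual user is actually loading the system.
busy_fraction = transaction_time / (transaction_time + reset_time)

# To keep 100 transactions in flight simultaneously, you need enough
# virtual users that the busy ones alone hit the goal.
goal_simultaneous = 100
virtual_users_needed = math.ceil(goal_simultaneous / busy_fraction)

print(busy_fraction)         # 0.5 -- each user is busy half the time
print(virtual_users_needed)  # 200 virtual users to keep 100 busy
```

The shorter the transaction is relative to the reset time, the smaller the busy fraction gets, and the more extra virtual users the division above will demand.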
If your boss wants the goal stated in terms of number of concurrent virtual users, then just start that number of virtual users plus a few to handle the resetting and reporting downtime, as mentioned above.
When defining the goal you might not have a way to directly measure the number of concurrent virtual users. In that case you can estimate the number of concurrent virtual users in your computing world if you know the total number of sessions in a peak hour and the average session duration (i.e., the time a user is considered concurrent), using Little’s Law:
ConcurrentUsers = NumOfSessions * AvgSessionDuration
First find the total number of sessions in a peak hour. Then convert the average session duration to units of an hour.
0.05 hours = 180 seconds / 3600 seconds per hour
Then multiply these two values together to find the number of concurrent virtual users.
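Those two steps can be worked through in a few lines. The peak-hour session count below is an assumed example figure; the 180-second duration is the one from the conversion above:

```python
# Little's Law sketch: sessions in the peak hour times the average
# session duration converted to hours.
sessions_per_peak_hour = 7200   # assumed peak-hour session count
avg_session_seconds = 180       # average session duration from the text

avg_session_hours = avg_session_seconds / 3600
concurrent_users = sessions_per_peak_hour * avg_session_hours

print(avg_session_hours)   # 0.05
print(concurrent_users)    # 360.0 concurrent virtual users
```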
Here is the same calculation applied to the physical world.
Suppose you are building a fast food restaurant and you want to serve 200 people per hour at peak. You also know from previous experience that the average diner spends 15 minutes sitting at a table in this type of restaurant (0.25 hour session duration). How many chairs will you need?

200 diners/hour * 0.25 hours = 50 chairs

Those chairs represent the concurrent users, or in this case, the concurrent diners.
If your boss wants the goal stated in transactions per second then your job is easy, as during the low-power test validation testing, you’ll get a good idea of how many virtual users you need to generate a throughput of X Transactions/second.
Measuring throughput without looking at response time is foolish. If you allow infinite response times, then any computer can handle any load. When monitoring response times in a typical load test, you will see this normal progression as the load increases:
- At a low load the response time looks fine.
- As you increase the load, at some point the response time will start to climb as a key resource becomes a bottleneck.
- If you push hard enough the response time will either keep climbing, or start dropping as transactions fail.
The odd thing about response time is that sometimes it is faster to fail than to succeed. It can be much faster to return an error like “Zoiks! Database lookup failure” than to perform a long, complex query. Once an application is warmed up (processes started, key files in cache, etc.), I’ve never seen a case where adding more load improved response time. My rule for performance work is: if the response time is improving under increased load, then something is broken.
For load test goals you need to define an upper limit for acceptable response time. Once you hit that number, then it is pointless to push your computing world harder. You may want to give some thought to how you specify that number. Response time matters a lot to users and how you specify the number will determine the acceptable number of suffering users at peak. You can specify response time as:
- No response time will exceed X seconds.
- The average response time will not exceed X seconds.
- 95% of transactions will take less than X seconds.
The first option (“No response time will…”) is very strict, and it will cost you a lot of money in additional hardware to keep the response time of every transaction under the limit. Also, if part of your transaction path crosses the Internet, then that part of the path is totally out of your control. A former colleague often says:
“Bad things happen on the Internet.”
The second option (“The average…”) is much easier to hit, but it has a problem in that many users will still be suffering. Depending on the distribution of response times, this could be half of your users. That’s a lot of unhappy users to plan for.
The third option (“95% of transactions …”) is your best bet. This lets you run your computing world harder than the first option and leaves fewer unhappy users than the second option. If you prefer, you can pick a different number than 95%. Some people like 98%, some people like 90%. Just as long as the boss accepts this number, all is well.
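Checking a percentile-style goal against measured response times is a one-liner's worth of counting. The sample times, the 2-second limit, and the 95% target below are all made-up illustrative numbers:

```python
# Sketch: check the "95% of transactions" goal against a set of
# measured response times (all numbers here are invented examples).
response_times = [0.8, 1.1, 0.9, 4.2, 1.0, 1.3, 0.7, 1.2, 9.5, 1.1]
limit_seconds = 2.0
target_fraction = 0.95

within_limit = sum(1 for t in response_times if t <= limit_seconds)
fraction_ok = within_limit / len(response_times)

print(fraction_ok)                     # 0.8 -- 80% under the limit
print(fraction_ok >= target_fraction)  # False: this run missed the goal
```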
Quality of Results
Your load test tool has to have a way of evaluating the quality of responses it gets back from your computing world. Simulating high volume user load can be a tricky business fraught with error. Applications will break under load, security checks will start getting in the way, third parties will have problems, and bad things will start happening on the Internet. The ultimate goal of any load test is to have your computing world smoothly handle a simulated peak load. The term “handle” does not mean to return useless nonsense in a fast and efficient manner. Your load test tool needs to check the returned results by looking for problems, error messages, and/or missing data.
Be sure to collect all the internal metering data you can. During a load test the load can be held very steady, and you can choose the exact mix of transactions, so performance meters can be calibrated (at 1000 TX/sec the bamboozle/sec meter = 4000) and explored (transaction X has twice the effect on this meter as transaction Y) with greater ease and precision than when metering a live user load.
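That calibration turns directly into a per-transaction cost you can use for forecasting. The projected peak below is a made-up number; the 1000 TX/sec and 4000 bamboozle/sec figures are the ones from the example above:

```python
# Calibration sketch: if the bamboozle/sec meter reads 4000 at a
# steady 1000 TX/sec, each transaction costs about 4 bamboozles.
tx_per_sec = 1000
bamboozles_per_sec = 4000

cost_per_tx = bamboozles_per_sec / tx_per_sec
print(cost_per_tx)   # 4.0 bamboozles per transaction

# Holding the load steady lets you forecast an assumed future peak:
projected_peak_tx = 2500
print(cost_per_tx * projected_peak_tx)   # 10000.0 bamboozles/sec expected
```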