Load testing is where you give your computing world an artificially generated workload to see how well it holds up.
To meter the test itself, look at your computing world with the normal meters you run every day. During the test, look for two things.
- Did the load test look like a normal load from real users?
- Did the load test hit a problem?
Looking For Normal
You know what the resource utilization of your computing world looks like under a moderate load. Do those numbers look similar during a moderate simulated load? If key resources are unusually idle or busy during the load test, then the load test itself is not doing a good job of replicating a user-generated load. High precision here is not required. If the simulated load is within 10-20% of the expected user-generated load, that will usually do. However, remember queuing effects. The higher the utilization, the more precise you have to be, because at high utilizations a small increase in utilization can cause a big increase in response time.
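That queuing effect is easy to see with a simple single-server queue approximation. This is a sketch, not a model of any particular system: it assumes M/M/1-style behavior, where response time is service time divided by the idle fraction of the device.

```python
# Sketch: why precision matters more at high utilization.
# Assumes a simple M/M/1-style single-server queue (real systems
# are messier): response_time = service_time / (1 - utilization)

def response_time(service_time, utilization):
    """Approximate response time for a single-server queue."""
    return service_time / (1.0 - utilization)

s = 0.010  # assume a 10 ms service time (illustrative number)

# A 10-point utilization error at low load barely matters...
low = response_time(s, 0.40) / response_time(s, 0.30)
# ...but the same 10-point error near saturation hurts a lot.
high = response_time(s, 0.95) / response_time(s, 0.85)

print(f"30% -> 40% busy: {low:.2f}x slower")
print(f"85% -> 95% busy: {high:.2f}x slower")
```

The same ten-point miss that changes response time by about 17% at low utilization triples it near saturation, which is why the load has to be dialed in more carefully at the high end.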
Load tests have goals they want to reach that are usually expressed in terms of number of things/sec the system has to handle. Unlike normal user load, a load test generated load starts, stops, and changes at your command. When metering this, you should adjust your metering frequency to capture multiple samples at key parts of the load test and adjust your meter start time so it’s nicely synchronized with the load test. In general, you want your meters to start sampling at the top of each minute and be running before, during, and for a while after the load test. The before and after data can help you determine if anything unusual was happening that might have skewed the results.
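Synchronizing meter start times with the clock is simple to do. The helper below is a hypothetical sketch (the function name and the sampling loop are illustrative, not from any particular metering tool); it just computes how long to wait so sampling begins at the top of the next minute.

```python
import time

# Sketch: start sampling at the top of the next minute so meter
# samples line up neatly with the load test timeline. Names here
# are illustrative, not from any particular metering tool.

def seconds_until_next_minute(now=None):
    """Seconds to wait so sampling begins at :00 of the next minute."""
    now = time.time() if now is None else now
    return 60 - (now % 60)

# Typical use: wait for the minute boundary, then sample at a fixed
# interval before, during, and for a while after the load test:
#   time.sleep(seconds_until_next_minute())
#   for _ in range(num_samples):
#       take_sample()          # hypothetical metering call
#       time.sleep(interval)
```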
If the load test achieved its goals, that’s great. Report how your computing world handled the load with a focus on any resource that looks like it is close to bottlenecking. Point out queues that are getting long, devices with a high utilization, resources that are close to running out of capacity, etc. Since load tests typically don’t simulate all transactions, they tend to under-utilize things. If some resource is close to a limit, you still might recommend adding more just to be safe.
If the load test failed to make its goals, that’s unfortunate. Report how your computing world handled the load with a focus on what bottlenecked during the test and what will bottleneck when the test eventually hits its goal load. How does that work? Let’s look at the data.
The table below has the results of a load test. The goal was 800 TX/sec. Everything was fine at 200 TX/sec, but the test bottlenecked at 400 TX/sec. It is clear that device-Y (at 94% busy) will limit any further significant progress towards our goal.
However, to reach our goal of 800 TX/sec, our load test will eventually have to push twice as many transactions per second through the system. Therefore it is reasonable to assume that every other device in the transaction path will see double the utilization. Take all the utilizations measured at 400 TX/sec and double them to see what else we would run out of at 800 TX/sec.
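The projection is just linear scaling. In the sketch below, device-Y's 94% comes from the example; device-X's 50% at 400 TX/sec is implied by it reaching 100% at the doubled load, and the device names follow the example's labels.

```python
# Sketch: project device utilizations linearly from the measured
# load to the goal load. Device-Y's 94% is from the example above;
# device-X's 50% is implied by it hitting 100% at double the load.

measured_tx, goal_tx = 400, 800
utilization = {"device-X": 0.50, "device-Y": 0.94}

scale = goal_tx / measured_tx  # here, 2x
for device, busy in utilization.items():
    projected = busy * scale
    flag = "  <-- will bottleneck first" if projected >= 1.0 else ""
    print(f"{device}: {busy:.0%} measured -> {projected:.0%} projected{flag}")
```

Any device whose projected utilization reaches or exceeds 100% is a bottleneck waiting for you at the goal load, even if it looked comfortable during the test.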
Device-X will be at 100% busy at your projected peak. That’s never a good thing. Noticing two problems with one load test is smart testing. It will save you time, money and embarrassment. I’d add more device-X and device-Y before I tried another load test.
Looking For Problems
When metering a load test, sometimes you'll see what looks like a dramatic increase in efficiency: the load test pushes a lot more transactions through your computing world with dramatically lower resource utilizations. I'm sorry to have to tell you, but this is always bad news.
Computer programs never suddenly get more efficient under a heavy load. What they do is start failing and sometimes it can be faster to fail than to succeed. Returning a simple error message is faster than providing a complex answer. Keep an eye on any reported errors and note any dramatic increase in errors as the load increases.
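A simple sanity check across load-test steps can catch this. The data and thresholds below are illustrative assumptions, not measurements: the idea is just to flag any step where throughput rises while utilization falls, or where errors jump sharply.

```python
# Sketch: flag suspicious load-test steps. More work with much less
# resource use, or an error spike, usually means fast failures,
# not new-found efficiency. All numbers here are illustrative.

steps = [
    {"tx_per_sec": 200, "cpu_busy": 0.30, "errors_per_sec": 0.1},
    {"tx_per_sec": 400, "cpu_busy": 0.60, "errors_per_sec": 0.2},
    {"tx_per_sec": 800, "cpu_busy": 0.35, "errors_per_sec": 50.0},
]

def suspicious(prev, cur):
    """True if this step looks like failing-fast rather than working."""
    faster_but_idler = (cur["tx_per_sec"] > prev["tx_per_sec"]
                        and cur["cpu_busy"] < prev["cpu_busy"])
    error_spike = cur["errors_per_sec"] > 10 * max(prev["errors_per_sec"], 1e-9)
    return faster_but_idler or error_spike

for prev, cur in zip(steps, steps[1:]):
    if suspicious(prev, cur):
        print(f"Investigate the {cur['tx_per_sec']} TX/sec step: "
              "likely fast failures, not efficiency")
```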