Load Testing: Different Approaches

The way you tune the settings of your load test is determined by what questions you need answered. It is not just about the next big peak. There are several different kinds of load tests: Load Testing, Isolation Testing, Stress Testing, and Endurance Testing. Each one of these can tell you interesting things about your computing world.

Load Testing

A load test applies a realistic external load that simulates the anticipated peak to directly measure response time, throughput, and other key internal performance meters. A load test can show you under what load the response time and throughput will start to get ugly.

A load test can also validate your capacity planning efforts. For example, when the capacity plan you just completed says your site can handle 500 things/second:

  • The load test might run just fine at 500 things/second, and all key resources report the predicted utilizations and throughput rates. All is well.
  • The load test might run just fine at 500 things/second, but the utilizations of key resources were well below what the capacity plan projected. You had better double-check the math on your capacity plan and make sure that the load test ran as expected.
  • The load test might only get to 340 things/second before some unexpected resource becomes a bottleneck. Fix that bottleneck, add that resource to the capacity plan for next year, and load test again.
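
The comparison between plan and test can be automated in a few lines. Here is a minimal sketch, assuming you can export the planned and measured utilizations as fractions between 0 and 1. The resource names, the numbers, and the 10-point tolerance are all made up for illustration.

```python
def compare_to_plan(planned, measured, tolerance=0.10):
    """Return the resources whose measured utilization differs from the
    planned utilization by more than `tolerance` (an absolute fraction)."""
    surprises = {}
    for resource, predicted in planned.items():
        actual = measured.get(resource)
        if actual is None:
            # The plan mentions a resource that the test never metered.
            surprises[resource] = "no measurement"
        elif abs(actual - predicted) > tolerance:
            surprises[resource] = f"planned {predicted:.0%}, measured {actual:.0%}"
    return surprises


# Hypothetical utilizations at 500 things/second.
planned = {"cpu": 0.70, "disk": 0.55, "network": 0.40}
measured = {"cpu": 0.68, "disk": 0.30, "network": 0.41}

surprises = compare_to_plan(planned, measured)
for resource, note in surprises.items():
    print(resource, "->", note)
```

Anything this reports is a reason to recheck either the capacity plan or the load test itself.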

A load test can also allow you to calibrate your meters, and other performance information gathering tools, under a very stable load. For example:

  • At 100 TX/sec the bamboozle/sec meter = 400
  • At 500 TX/sec the bamboozle/sec meter = 2000
  • At 1000 TX/sec the bamboozle/sec meter = 4000
  • So now you have multiple data points showing that each TX (on average) generates four bamboozles
  • Now you can use the bamboozle/sec meter any time as a workload meter by dividing it by four, even though you have no clue what a “bamboozle” is.
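
The calibration arithmetic above can be sketched like this. The function names are hypothetical, and the spread check is an added assumption: it is just a quick way to confirm that the ratio holds steady across all the loads you tested.

```python
def calibrate(points):
    """points is a list of (tx_per_sec, meter_per_sec) pairs from stable runs.
    Returns (mean meter units per TX, relative spread of the ratios)."""
    ratios = [meter / tx for tx, meter in points]
    mean = sum(ratios) / len(ratios)
    spread = (max(ratios) - min(ratios)) / mean
    return mean, spread


def estimate_tx_rate(meter_per_sec, per_tx):
    """Turn a meter reading back into an estimated transaction rate."""
    return meter_per_sec / per_tx


# The three data points from the list above.
points = [(100, 400), (500, 2000), (1000, 4000)]
per_tx, spread = calibrate(points)

print(per_tx)                          # bamboozles per TX
print(spread)                          # 0 means the ratio never varied
print(estimate_tx_rate(2400, per_tx))  # estimated TX/sec for a reading of 2400
```

If the spread is large, the meter is not tracking the workload linearly and should not be used as a workload meter.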

Isolation Testing

In a load test you control the mix of transactions, so you can feed your computing world a pure stream of only one kind of transaction, rather than the mix of transactions the users usually generate. This can be a useful way to explore your computing world.

First, notice if there are any systems or resources that you did not expect this type of transaction to use. If a performance meter surprises you, you have more work to do.

If you are hunting a problem, or checking to see if some change to your computing world made a difference in performance, isolation testing can help.  By testing the major transaction types separately, you might find that only the X transaction has dramatically slowed down or that only the Z transaction causes the problems you are seeing.

An isolated test of a transaction made of distinct parts (e.g., a web-based transaction that visits the home page, searches, and then puts that thing in a cart) can show you clearly which part of the longer transaction is having performance troubles under what load.

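[Table: response time of each part of the web transaction (home page, search, add to cart) at increasing TX/sec]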

Since you ran this transaction in isolation, the metering data you are getting from your computing world is only showing work generated by this kind of transaction. That makes it easier to find the bottleneck. The table above clearly shows that the Search part of this transaction is having some problems at 200 TX/sec and is really hurting at 300 TX/sec. The performance meters during this load test should give you a big clue as to where the problem lies.
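
One way to spot the troubled part of a multi-step transaction is to compare each step's response time against its own time at the lightest load. This is a minimal sketch. The response times below are invented for illustration and are not the measurements from the table; only the step names come from the example above.

```python
# Response time in seconds for each step at each load (TX/sec).
# Illustrative values only.
results = {
    100: {"home": 0.10, "search": 0.30, "cart": 0.20},
    200: {"home": 0.11, "search": 0.90, "cart": 0.21},
    300: {"home": 0.12, "search": 4.00, "cart": 0.22},
}


def slowdown_by_step(results):
    """For each load, divide each step's response time by that step's
    response time at the lowest tested load."""
    baseline = results[min(results)]
    return {
        load: {step: rt / baseline[step] for step, rt in steps.items()}
        for load, steps in results.items()
    }


def worst_step(results, load):
    """Name of the step that slowed down the most at the given load."""
    ratios = slowdown_by_step(results)[load]
    return max(ratios, key=ratios.get)


for load in sorted(results):
    print(load, "TX/sec: worst step is", worst_step(results, load))
```

Using ratios rather than raw seconds keeps a step that is naturally slow from hiding a step that is getting slower.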

Stress Testing

A stress test is just a load test where you purposely overdrive the system to find its breaking point.  This is done by running a load test with a normal transaction mix, but with way too many users and/or no think time.
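
The search for the breaking point can be sketched as a simple ramp: raise the load in steps until the response time blows past a limit or the run fails outright. In the sketch below, `run_at_load` stands for whatever your load tool does to run one step and report a response time; it is a hypothetical hook, not a real tool's API. The `simulated_system` function is only a stand-in so the example runs; it uses the textbook single-queue approximation R = S / (1 - utilization) with made-up capacity and service time.

```python
def find_breaking_point(run_at_load, start, step, max_load, rt_limit):
    """Raise the load in steps. Return the first load at which the run
    fails (None) or the response time exceeds rt_limit, or None if the
    system survives every load up to max_load."""
    load = start
    while load <= max_load:
        response_time = run_at_load(load)
        if response_time is None or response_time > rt_limit:
            return load
        load += step
    return None


def simulated_system(load, capacity=500.0, service_time=0.05):
    """Stand-in for a real test run. Returns seconds, or None once the
    offered load meets or exceeds the assumed capacity."""
    utilization = load / capacity
    if utilization >= 1.0:
        return None
    return service_time / (1.0 - utilization)


breaking_point = find_breaking_point(
    simulated_system, start=100, step=50, max_load=1000, rt_limit=0.4
)
print("Response time limit exceeded at", breaking_point, "TX/sec")
```

Smaller steps give a more precise answer at the cost of a longer test.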

If you run your load test and you achieve your goals, it is still useful and interesting to push the system to its breaking point to see exactly how it breaks, so you know it when you see it. Would you want to go to a doctor who had studied medicine carefully, but had never seen a really sick person?

Endurance Testing

This is a load test where you study whether everything in your computing world can keep running over time, not just for a few minutes. Typically, the load is well below the peak load, and what you are looking for are things that you can run out of or that don’t scale well. Common questions to look at are:

  • How much disk capacity is consumed per transaction?
  • How much memory is leaking per transaction?
  • Is there an unknown hard coded limit in the software?
  • Are throughput and response time just as good after the millionth transaction (when files are bigger, databases age, and sorting algorithms are challenged) as they were for the early transactions?
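
The per-transaction questions above come down to fitting a line through periodic samples. Here is a minimal sketch using an ordinary least-squares slope, assuming you sample a resource (memory or disk in use) alongside a count of completed transactions. The sample values and the limit are illustrative, not measured.

```python
def growth_per_transaction(samples):
    """samples is a list of (transactions_completed, units_in_use) pairs.
    Returns the least-squares slope: units consumed per transaction."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    denominator = sum((x - mean_x) ** 2 for x, _ in samples)
    return numerator / denominator


def transactions_until_exhausted(samples, limit):
    """Estimate how many more transactions fit before usage reaches limit.
    Returns None if usage is flat or shrinking."""
    growth = growth_per_transaction(samples)
    if growth <= 0:
        return None
    _, latest_use = samples[-1]
    return (limit - latest_use) / growth


# Illustrative samples: memory in use (MB) after N transactions.
samples = [(0, 1000), (1000, 1200), (2000, 1400), (3000, 1600)]

leak = growth_per_transaction(samples)
remaining = transactions_until_exhausted(samples, limit=2000)
print("MB leaked per transaction:", leak)
print("Transactions left before the limit:", remaining)
```

A straight-line fit assumes the growth is steady. If the samples curve upward, the real limit will arrive sooner than this estimate.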

These tests are important to run before software is put into production, because stopping and fixing problems in the middle of the day on the live system is usually not an option the company wants to take.

Generating The Load

Some collection of computers and software generates the incoming workload for your load test and evaluates whether the work is being handled successfully.

Depending on your situation you might build it yourself, or have some company generate the load for you, typically via the Internet.

Here are some things to look for when evaluating your options.

Location

Where you generate the load matters. The load should flow through as much of your computing world as it would normally. Any part of your computing world that you do not test is, by definition, untested. That untested part is likely to keep you up at night worrying and surprise you, in an unpleasant way, during the peak with its shocking lack of throughput.

If you are doing a stand-alone load test of a small subsystem, then the generated load should come from outside the tested computer(s). Why? First, it takes resources to generate load, and you’d like a clean set of performance data from the tested system(s). Also, if you generate the load on the tested system, then you are not testing the network connections through which the real load will have to flow.

If you are doing an end-user load test, then the load should be generated outside your company and from the locations where your users live. Distance matters on the Internet.
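
To get a feel for why distance matters, here is a rough lower bound on network round-trip time. It assumes signals travel through optical fiber at about 200,000 km per second (roughly two-thirds of the speed of light in a vacuum) and ignores routing detours, queuing, and server time, so real round trips will be slower. The 6,000 km distance is just an example.

```python
FIBER_KM_PER_SEC = 200_000  # approximate signal speed in optical fiber


def min_round_trip_ms(distance_km):
    """Best-case round-trip time in milliseconds over the given one-way
    distance, counting only propagation delay."""
    return 2 * distance_km / FIBER_KM_PER_SEC * 1000


rtt = min_round_trip_ms(6000)
print("Best-case round trip over 6000 km:", rtt, "ms")
```

A transaction that needs several round trips pays this price each time, which load generated from next door to your data center will never show you.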

Ease Of Use

The sales pitch for the load test tool will tend to focus on the beauty and the flexibility of how it displays results.  That’s all good, but you’ll spend a lot more time creating and debugging the load test than you will spend running and evaluating it. When selecting a load generation tool carefully note:

  • The ease with which you can create new transactions and modify existing ones.
  • The quality and clarity of diagnostic info you get back when transactions are failing.
  • How easily and rapidly the tool can schedule, stop, and restart tests. Load testing is a team sport. Making people wait for you, and the load testing tool, is never fun.
  • How close to real time you get the results for transactions started and completed, transaction response time, and failure rate. You want the bad news as soon as possible, so you can stop, fix, and restart the test.

Money

Generating load costs money. More load, more money.

Budget for the testing you’ll have to do before the big load test, and plan to work through several failures where you have to stop the test, fix something, and restart it.

When To Load Test

When driving, it is best to start applying the brakes when you have enough time to easily avoid disaster. In load testing, it is best to start your efforts when you have enough lead-time to fix any performance problem you uncover. That amount of time is different for every situation as there are many things that will influence your decision as to when to get started.

All organizations have a pace at which they feel comfortable. This is especially true when spending money. A good first question to ask yourself when it looks likely that you’ll have to spend to get through the next peak is “How long did it usually take between decision and delivery of the last few major IT purchases?” Every company is different, and I’ve personally seen behavior ranging from 30 days to almost two years. You need to do your work with enough lead-time to take this into account. Don’t get me wrong, change can happen faster than this, but it is just a lot more pleasant for everyone involved if you take into account the company’s normal pace.

There are many other factors that influence when to test. Weigh each one as you choose the best time to do your work.

  • The annual pre-peak hardware/software freeze
  • The anticipated dates for big infrastructure changes
  • The anticipated dates for the roll out of new websites, applications, or features
  • The last quarter of the fiscal year for key vendors
  • When money is available in your budget

As the list above suggests, when you take all factors into account, there might not be an ideal time to do the load test. In that case, pick your battles and begin your work. With time, all things are possible. In general, earlier is better.


For more info see: The Every Computer Performance Book, on Amazon and iTunes.
