A Thousand Thanks

Late last night some nice person bought the 1000th copy of my book. Wow!

I’d like to thank each and every one of my readers. When I started this book I did it to give back to the field of performance analysis. I’d hoped to sell 500 copies. Why 500? Since I can’t find everyone who might need to read this book, I decided to be happy with 500 as that is the number of students I’d have in an average teaching year when I was teaching the performance course at Stratus. Apparently I aimed too low.

The best part of this experience has been hearing from my readers. Their personal messages to me, comments on this blog, and recommendations of my book have been a joy to read. Yes, even the comment that kindly pointed out my typo in chapter eight where I wrote “pubic speaking” when I meant “public speaking.”

 

How To Fly

During my career I flew over a million air miles. Even though I’m just shy of seven feet tall, I mostly enjoyed the experience once I figured out a few key things about flying. I hope these insights help you.

Attitude is everything in flying

People are attracted to (and want to help) grateful, kind, and pleasant people. Think about your own life. When you have served others, what kind of person did you bend rules and go the extra mile for? When a problem happens, it is very rare that the ungrateful, unkind, and unpleasant person gets to their destination any faster than the kind person.

Consider the alternative

Regardless of how many things go wrong on a trip, flying is so much better than a bus. Until transporter technology is perfected, flying (even with all its hassles) is really your best high-speed choice for transport.

Air travel is like prison, but in a good way

Flying (especially after 9/11) is just about a total surrender of your civil rights and any illusion of control you might have. Realize that when you fly you make a trade: You surrender almost all your rights and they move you across the planet at over 500mph. It is only a good trade if you accept both sides of the deal. If you don’t, then don’t fly because many people will suffer as you hold up the security line.

All airlines are great and all airlines stink

The people I know who have flown more than a million miles all have a favorite airline they LOVE and an airline they HATE. The interesting thing is that they all love/hate different airlines. Even though collectively we have lots of data points (flights), there is no consensus. These love/hate feelings are often rooted in just a few good/bad incidents. On any given day a given airline is either awful or glorious.

Airline employees

Airlines are huge companies so don’t expect perfection from all 120,000 employees. For all large groups there is often a bell-shaped curve of performance. A few do wondrous work, the vast bulge in the middle do what is expected, and a few at the other end take sadistic pleasure in creating a private hell just for you. It’s a crapshoot who you will meet and if they are having a bad/good day.

  • Flight attendants have no power to change anything about the flight, but they do occasionally bring you an extra cookie.  Help them by staying in your seat during meal times and taking your seat quickly when asked to do so.
  • Ticket/Gate agents have almost no power to give you extra perks, but they do have the awesome power of not offering you the help you haven’t specifically asked for. They are the most yelled at employees of any airline. Never yell at them because a flight is delayed, canceled or otherwise screwed up. It is not their fault. Although on general principles I believe in treating people nice because it is the right thing to do, I can assure you that nice people are offered more choices and are sometimes upgraded. It pays to be genuinely nice.

Connections

Almost all flights connect. Choose your connecting flights so you have options and time. If possible, always avoid a connecting flight that is the last one of the day to your destination. When booking a flight, your various flight options are usually sorted so the connecting flight that leaves as soon as possible after your first flight arrives is at the top of the list. Often that is a tight (not much time to run from plane to plane) connection. Why sweat a tight connection when you can leave a little earlier? I personally like a four-hour connection and in the last 10 years of my business flying I never missed a connection. Not one.

It is always sunny at 35,000 feet

If you like to look out the window, the view is much better if you choose the seat that is on the “shady” side of the plane – where the sun is behind you. Think about the flight direction and time of day to figure that out. The “A” seats (as in seat 23A) are on the left side of the plane.

Leave early

If (as I have heard so many people loudly proclaim) missing this flight will make you miss some critical event (interview, meeting, wedding, birth, …) then you are a fool for cutting it that close. All airlines have three major partners that they have no control over: the Federal Aviation Administration, Homeland Security, and Mother Nature. If you are traveling for a once-in-a-lifetime, super-important reason, leave two days early. Three days early if it is your wedding. Really.

When trouble strikes

When you fly, and things are not going well, there are a few key facts-of-life you must understand to rationally evaluate what your options are.

  1. There is no spare plane. The cost of keeping a spare 100 million dollar plane sitting around is very high. Even at huge hub airports there is no spare plane and no spare crew waiting to fly it. If your plane breaks then either the passengers are spread out over other flights or the airline cancels some other flight and assigns the plane to your flight.
  2. Flight status displays lie right up to the last minute. The airlines typically show the flight status of your flight as “on time” right up to the last minute. To get a better idea of your probability of flying on time look at the arrivals monitor for the flight that lands at the gate your flight is scheduled to depart from. Typically that flight arrives about an hour before your departure. If it’s delayed…the probability that your departing flight will be delayed goes way up.
  3. Insignificant weather matters. When the airlines say “bad weather at the destination” is causing the delay, sometimes it is violent weather. But most times it is just local conditions that lower the airport’s overall capacity to move airplanes:
    • Unusual wind conditions can force the airport to use a set of runways that has less capacity for takeoffs and landings.
    • Visibility can be just bad enough so that they have to switch to a different set of flight rules that either further spaces out takeoffs and landings or it prevents planes from simultaneously landing on parallel runways.
  4. Planes do not hurry. Once in the air, the captain of a delayed flight will often say something like “We will do what we can to make up time.” What they can do is basically nothing. Fuel is expensive and going just a little faster burns a lot more fuel. Also the difference between the cruise and max speeds, for several commonly used jet aircraft, is less than 10%. If you get in the air late, you will arrive late.

If you think about each one of the above facts of life you might see analogs in the computer performance work you do.

When you are stuck

When a huge storm shuts down the airport… accept your fate. The only rational thing to do is to ride the chaos with grace and style. Stay flexible, stay pleasant, and be helpful to others. Plan to convert this dreary experience into a great story. You can write the story of The Massive Airport Blizzard to read either:

  1. I yelled at dozens of people to no effect and got home two days late.
  2. I had some really interesting conversations, helped someone, made a new friend, and got home two days late.

It is your choice. Choose to be happy, because grumpy rarely works for anyone.


I also have many useful hints about doing computer performance work once you land at your destination in: The Every Computer Performance Book which is available at Amazon, B&N, or Powell’s Books. The e-book is on iTunes.


 

Three Tools You Should Build

Although it is a good idea to keep an eye on performance all the time, lots of companies only allow you to pay periodic attention to performance. They focus on it when there is a problem, or before the annual peak, but the rest of the year they give you other tasks to work on.

This is a lot like my old job in Professional Services – A customer has a problem, I fly in, find the trouble, and then don’t see them until the next problem crops up.

To do that job I relied on three tools that I created for myself and that you might start building to help you work on periodic performance problems.

Three Tools

List All – The first tool would dig through the system and list all the things that could be known about the system: config options, OS release, IO, network, number of processes, what files were open, etc. The output was useful by itself as now I had looked in every corner of the system and knew what I was working on. Several times it saved me days of work as the customer had initially logged me into the wrong system. It always made my work easier as I had all the data I needed, in one place, conveniently organized, and in a familiar order.
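As a rough illustration of the List All idea (not the original tool, which is not shown here), a minimal Python sketch might gather a handful of facts using only the standard library and print them as plain text in a fixed order. The function name `list_all` and the particular facts chosen are assumptions made for this example; a real tool would grow to cover config options, IO, network, processes, and open files.

```python
import os
import platform
import socket
import sys


def list_all():
    """Collect a few basic facts about the current system as text lines."""
    facts = {
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
        "cpu_count": os.cpu_count(),
    }
    # Sorting by key keeps the output in the same familiar order on every run.
    return [f"{key}={value}" for key, value in sorted(facts.items())]


if __name__ == "__main__":
    print("\n".join(list_all()))
```

Plain `key=value` lines are easy to save to a file per visit and easy to compare later.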

Changes – If I’d been to this customer before, this tool allowed me to compare the state of the system with the previous state. It just read through the output of the List All I’d just done and compared it with the data I collected on my last visit. Boy, was this useful as I could quickly check the customer’s assurance that “Nothing had changed since my last visit.” I remember the shocked look on the customer’s face when I asked: “Why did you downgrade memory?”
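A comparison like this can be sketched in a few lines of Python, assuming each visit's data was saved as `key=value` text lines. That format, the function names, and the sample values below are all made up for illustration; they are not taken from the original tool.

```python
def parse_snapshot(lines):
    """Turn 'key=value' lines into a dict, ignoring blank lines."""
    snapshot = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        snapshot[key] = value
    return snapshot


def changes(previous_lines, current_lines):
    """Report keys that were added, removed, or changed between two snapshots."""
    old = parse_snapshot(previous_lines)
    new = parse_snapshot(current_lines)
    report = []
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            report.append(f"REMOVED {key} (was {old[key]})")
        elif key not in old:
            report.append(f"ADDED   {key} = {new[key]}")
        elif old[key] != new[key]:
            report.append(f"CHANGED {key}: {old[key]} -> {new[key]}")
    return report


# Hypothetical data from two visits to the same system.
last_visit = ["memory_gb=64", "os_release=5.1", "disks=4"]
this_visit = ["memory_gb=32", "os_release=5.1", "disks=4", "swap_gb=8"]
for line in changes(last_visit, this_visit):
    print(line)
```

With the sample data above, the report flags the memory downgrade and the newly added swap entry, and stays silent about everything that did not change.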

Odd Things – Most performance limiting, or availability threatening, behavior is easy to spot. But for any OS, and any application, there are some things that can really hurt performance that you have to dig for in odd places with obscure meters. These are a pain to look for and are rare, so nobody looks for them. Through the years as I discovered each odd thing, I would write a little tool to help me detect the problem and then I’d add that tool to the end of my odd things tool. I’d run this tool on every customer system I looked at and, on occasion, I would find something that surprised everyone (“You haven’t backed up this system in over a year.”) or solve a performance problem by noticing a less-than-optimal configuration choice.
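One possible shape for such a tool, sketched in Python under stated assumptions: each odd thing is a small function that returns a message when it finds trouble and `None` otherwise, and new checks are simply appended to a list. The decorator, the names, and the disk-space check are hypothetical examples, not the checks from the original tool.

```python
import shutil

ODD_THING_CHECKS = []


def odd_thing(check):
    """Decorator: append a check function to the end of the odd-things list."""
    ODD_THING_CHECKS.append(check)
    return check


@odd_thing
def disk_nearly_full(path="/", threshold=0.95):
    """Hypothetical check: warn when the filesystem holding `path` is nearly full."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    if used_fraction >= threshold:
        return f"{path} is {used_fraction:.0%} full"
    return None


def run_odd_things():
    """Run every registered check and return only the findings."""
    findings = []
    for check in ODD_THING_CHECKS:
        try:
            result = check()
        except Exception as exc:  # a broken check should not stop the rest
            result = f"check failed: {exc}"
        if result:
            findings.append(f"{check.__name__}: {result}")
    return findings
```

Each newly discovered odd thing becomes one more decorated function, so the tool grows a check at a time without changing the code that runs them.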

With most everything happening on servers somewhere in the net/cloud these days, knowing exactly where you are and what you’ve got to work with is important. Being able to quickly gather that data in a matter of minutes allows you to focus on the problem at hand, confident that you’ve done a thorough job.

All three of these tools were built slowly over time. Get started with a few simple things. The output of all three is just text – no fancy GUI interface or pretty plots are required. When you have time, write the code to gather the next most useful bit of information and add that to your tool.

Just like the old folk story of stone soup, your tools will get built over time with the contributions of others. Remember to thank each contributor for the gifts they give you and share what you have freely with others.


Other useful hints can be found in: The Every Computer Performance Book which is available at Amazon, B&N, or Powell’s Books. The e-book is on iTunes.


 

 

The Sample Length of The Meter

Any meter that gives you an averaged value has to average the results over a period of time. If you don’t precisely understand that averaging, then you can get into a lot of trouble.

The two graphs below show exactly the same data with the only difference being the sample length of the meter. In the chart below the data was averaged every minute. Notice the very impressive spike in utilization in the middle of the graph. During this spike this resource had little left to give.

In the chart below the same data was averaged every 10 minutes. Notice that the spike almost disappears as the samples were taken at such times that part of the spike was averaged into different samples. Adjusting the sample length can dramatically change the story.
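The effect is easy to reproduce with a few lines of Python. The numbers here are made up for illustration (they are not the data behind the charts): 20 one-minute utilization readings with a four-minute spike that happens to straddle the boundary between two 10-minute samples.

```python
def window_averages(samples, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


# Made-up data: mostly 20% busy, with a 4-minute spike to 100%
# that straddles the 10-minute boundary.
one_minute = [20] * 8 + [100] * 4 + [20] * 8

ten_minute = window_averages(one_minute, 10)

print(max(one_minute))  # 100
print(ten_minute)       # [36.0, 36.0]
```

With one-minute samples the resource is visibly pinned at 100%. With 10-minute samples the same spike is split across two windows, and each reports a comfortable-looking 36%.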

Some meters just report a count, and you’ve got to know when that count gets reset to zero or rolls over because the value is too big for the variable to hold. Some values start incrementing at system boot, some at process birth.
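For a counter that rolls over, the change between two readings can still be recovered with modular arithmetic, as long as you know the counter's width. The sketch below assumes an unsigned counter of `bits` bits that wrapped at most once between readings and was not reset; the function name and the 32-bit default are assumptions for illustration.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two readings of a counter that wraps at 2**bits.

    Assumes the counter wrapped at most once between readings and was
    not reset to zero (for example by a reboot or a process restart).
    """
    modulus = 1 << bits
    return (current - previous) % modulus


# The counter was near its 32-bit maximum, then wrapped around to 704.
print(counter_delta(4_294_967_000, 704))  # 1000
```

A reset to zero cannot be told apart from a rollover by the numbers alone, which is why it pays to know whether a given count starts at system boot or at process birth.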

Some meters calculate the average periodically on their own schedule, and you just sample the current results when you ask for the data. For example, a key utilization meter is calculated once every 60 seconds and, no matter what is going on, the system reports exactly the same utilization figure for the entire 60 seconds. This may sound like a picky detail to you now, but when you need to understand what’s happening in the first 30 seconds of market open, these little details matter.

Below you will see a big difference in the data you collect depending on how you collect and average it. In the one-second average (red line) you are buried in data. In the one-minute average (sampled in the yellow area) you missed a significant and sustained peak because of when you sampled. The 10-minute average (sampled in the green area) will also look reassuringly low because it averages the peaks and the valleys.

Take the time, when you have the time, to understand exactly when the meters are collected and what period they are averaged over. The best way to do that is to meter a mostly idle system and then use a little program to bring a load onto the system for a very precise amount of time and see what the meters report. The better you understand your tools, the more precisely and powerfully you can use them.
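The "little program" can be very simple. As one possible sketch (the function name is an assumption, and it loads a single CPU core only), this Python function spins for a requested number of seconds and returns the wall-clock start and end times so they can be lined up against what the meters report.

```python
import time


def burn_cpu(seconds):
    """Keep one CPU core busy for about `seconds` seconds; return (start, end)."""
    start = time.time()                    # wall-clock time, to match meter timestamps
    deadline = time.monotonic() + seconds  # monotonic clock for the duration itself
    spins = 0
    while time.monotonic() < deadline:
        spins += 1  # busy work; the value itself does not matter
    return start, time.time()


# Example use on a mostly idle system:
#   start, end = burn_cpu(30)
#   print(time.ctime(start), time.ctime(end))
```

Running it for, say, 30 seconds and then looking at where (and how high) the load shows up in each meter reveals when that meter samples and over what period it averages.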


This hint and many others are in: The Every Computer Performance Book which is available at Amazon, Powell’s Books, and on iTunes.


 

Thank You

I’d like to take a moment and thank the nice people who have bought my book. For the first time today (July 5, 2014) the book cracked Amazon’s Top 100 books in computer science. I am deeply grateful.

Working A Little Harder

Sometimes you have to work a little harder than you’d like to find the data you need to solve a performance problem or answer a performance question. Sometimes you have no elegant tool and just have to jury-rig some ugly collection of hacks to get what you need.

 

The photographer pictured above was not comfortable, safe, or delighted to have this assignment at this moment. However, I bet to the end of his days he told this story with great pride. If you don’t have everything you need in its most convenient place, just at the right time… do what you can, with what you got. Yeah, it’s a pain, but it is also the start of a great story.


For more specific hints on getting performance work done see: The Every Computer Performance Book at Amazon, Powell’s Books, and on iTunes.


No Bad Surprises In Public

Never plan to surprise the person responsible for a problem in a public meeting. The goals of performance work are measured in response time and throughput, not in how much drama you create when you point your accusing finger at the unsuspecting culprit.

When you locate a problem, the first person you should find is the person who is responsible for that part of the computing world, and discuss that problem with him or her. Why? That person may know a lot more about that part of your computing world than you do, and may have further insights as to the root cause and the reason(s) why things are done this way. Often, I find that when I privately share my concerns and ask for help in crafting a list of possible solutions, that person is quite willing to be helpful.

I have made the mistake of not involving the person I believed was responsible for the problem and have suffered these consequences, usually in this exact order:

  1. The person responsible for that part of the computing world gets angry and defensive and works relentlessly to tear down my work and credibility.
  2. That person points out my ignorance and further points out that the real problem is caused by some other part of the computing world owned by a different person. Now there are two angry people in the room.
  3. Now the manager becomes angry with me for creating tension among the staff.

It always works better when I talk to the responsible person privately well before I write up my recommendations. We look at the problem and explore solutions. Then I can walk into the meeting and say something like: “The problem is in this subnet. With the help of your networking guru, we have a few ideas on how to improve the situation.”


For more hints on presenting your work see: The Every Computer Performance Book at Amazon, Powell’s Books, and on iTunes.