Lies, damn lies, and benchmarks

Benchmarking is the process of running a specific workload on a specific machine or application and measuring the resulting performance. Done well, this technique provides an accurate evaluation of how that machine performs for that workload.
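At its simplest, that is the whole recipe: run the workload, time it. Here is a minimal sketch in Python, where the `workload` function is a hypothetical stand-in for whatever you actually care about measuring:

```python
import time

def workload():
    # Hypothetical stand-in for the workload you actually care about.
    return sum(i * i for i in range(1_000_000))

start = time.perf_counter()
workload()
elapsed = time.perf_counter() - start
print(f"workload took {elapsed:.4f} s")
```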

However, this process is surprisingly difficult to do well, with many opportunities for mistakes and oversights.

To put it in other terms: suppose you are interested in buying a car, and you want the one with the best acceleration, so you want to benchmark how many seconds each car takes to go from zero to 60 mph. If you benchmark two cars by putting them up on a rack and spinning the wheels to measure rotation and acceleration, you get one result; when you actually drive the two cars, you get another, because one may weigh more than the other or have better aerodynamics. This is an example of why it is important to understand what your benchmark actually measures, and how closely that matches the conditions you care about.

There are several approaches you can follow when benchmarking, although the most common one is affectionately named fire-and-forget. It consists of the following steps:

  1. Run the workload tool.

  2. Go grab a coffee or tea (I drink neither, so it doesn’t matter).

  3. Come back to the machine once the benchmark concludes.

  4. Take the benchmark result and assume it is accurate.

As the irony suggests, this is definitely not the technique I would recommend.
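To see why, here is a minimal sketch (again with a hypothetical `workload` function) that repeats the same measurement ten times. On a real machine, the spread between samples is often wide enough that whichever single number a fire-and-forget run happened to produce would have been misleading:

```python
import statistics
import time

def workload():
    # Same hypothetical stand-in workload as before.
    return sum(i * i for i in range(1_000_000))

# A fire-and-forget benchmark would stop after a single sample.
# Taking several samples shows how much any one of them can mislead.
samples = []
for _ in range(10):
    start = time.perf_counter()
    workload()
    samples.append(time.perf_counter() - start)

print(f"min:    {min(samples):.4f} s")
print(f"median: {statistics.median(samples):.4f} s")
print(f"max:    {max(samples):.4f} s")
print(f"stdev:  {statistics.stdev(samples):.4f} s")
```

One sample is not a result; it is an anecdote. The sections below look at what a sounder methodology involves.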

Benchmarking is an art. Throughout this article, some of the mysteries behind benchmarks will be uncovered, and new ones will inevitably arise. Hence, use this article as a guide: you don’t need to read it top to bottom, even though I strongly suggest you do.

Use the following table of contents (TOC) as your personal compass.

  1. Preparing the Environment
  2. Benchmark Methodology
  3. Evaluating Benchmark Results
    1. Be realistic in your Benchmarks
    2. Benchmark results can tell you more than performance gotchas
  4. Benchmark Pitfalls
    1. Benchmark Tool Limitation
    2. Coordinated Omission