Bentley Rules for Optimising work : Performance Engineering

Vipul Vaibhaw
3 min readJan 18, 2021

The work of a program on given input is sum total of all operations executed by the program.

In this blog I will be taking notes from the lecture 2 on Performance engineering by MIT OCW.

Algorithm design can produce dramatic reduction in the amount of work it takes to solve a problem, as when O(n log n) replaces 0(n²).

Reducing the work of a program does not automatically reduce its running time, however, due to the complex nature of computer hardware — Caching, Vectorization, Speculation and branch prediction, instruction-level parallelism(ILP).

Nevertheless, reducing the work serves as a good heuristic for reducing overall running time.

Classic Bentley rules for dealing with work can be found here — http://www.new-npac.org/projects/cdroms/cewes-1999-06-vol1/nhse/hpccsurvey/orgs/sgi/bentley.html

The professor divides the proposed New Bentley Rules in 4 categories —

  1. Data Structures —
  • Packing and Encoding — Packing means to store more than one data value in a machine word. Encoding is to convert data values into a representation requiring fewer bits. Example, store dates in a struct format rather than string.
  • Augmentation — Add information to a data structure to make common operations do less work.
  • Precomputation — perform calculations in advance so as to avoid doing them at “mission critical” times. (Look meta-programming here)
  • Compile-time initialisation — (Look meta-programming here)
  • Caching — Store results that have been accessed recently so that the program need not to compute them again. Improves cache hit rates.
  • Lazy Evaluation —
  • Sparsity — Avoid computing zeroes. (Compressed Sparse Row)

2. Loops —

  • Hoisting — loop-invariant code, avoid recomputing loop-invariant code each time through the body of a loop.
  • Sentinels
  • Loop unrolling
  • Loop fusion
  • Eliminating wasted iterations

3. Logic —

  • Constant folding and propagation — Evaluate constant expressions and substitute the results into further expressions, during compilations.
  • Common-subexpression elimination — Avoid computing same expression multiple times by evaluating the expression once and storing the result for later use.
  • Algebraic identities — Replace expensive algebraic identities with equivalents that require less work. instead of comparing square roots compare squares.
  • Short-circuiting — When performing a series of tests, the idea of short circuiting is to stop evaluating as soon as you know the answer.
  • Ordering tests — Inexpensive tests should precede expensive ones. Check the tests which are rarely successful first.
  • Creating a fast path
  • Combining tests — replace sequence of tests with one test or switch.

4. Functions —

  • Inlining — Avoid function call overhead by replacing a call to the function with the body of the function itself. Inlined functions can be as efficient as macros and they are better structured.
  • Tail-recursion elimination — Replace a recursive call that occurs as the last step of a function with a branch, saving function — call overhead.
  • Coarsening recursion

Important Advice —

  1. Avoid premature optimization.
  2. Reducing the work does not necessarily decrease its running time!
  3. The compiler automates many low level optimizations.
  4. Look at the assembly code to tell if the compiler is actually performing a particular optimization.

--

--

Vipul Vaibhaw

I am passionate about computer engineering. Building scalable distributed systems| Web3 | Data Engineering | Contact me — vaibhaw[dot]vipul[at]gmail[dot]com