Recently I tried an optimization in a frequently executed piece of code: three small variables needed to be saved in a kind of fixed-size “stack array”. In the first version, each variable was stored in and retrieved from its own array.
After packing all the information into a single array with shifts and ORs (saving two array accesses), the code was significantly faster! Obviously array access is much more expensive than a few bit-ops…
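To make the packing concrete, here is a minimal sketch. It assumes Java and assumes the three values fit into 8, 8, and 16 bits; the post does not state the language or the actual widths, and the names `PackedStack`, `pack`, and `unpackA`/`unpackB`/`unpackC` are made up for illustration.

```java
// Hypothetical sketch: the field widths (8, 8, 16 bits) are assumptions,
// not taken from the original code.
final class PackedStack {
    private final int[] slots; // one array instead of three
    private int top = 0;

    PackedStack(int capacity) {
        slots = new int[capacity];
    }

    // Pack a (0..255), b (0..255) and c (0..65535) into one 32-bit int.
    static int pack(int a, int b, int c) {
        return (a & 0xFF) << 24 | (b & 0xFF) << 16 | (c & 0xFFFF);
    }

    // Unsigned shift so a value with the top bit set comes back positive.
    static int unpackA(int p) { return p >>> 24; }
    static int unpackB(int p) { return (p >>> 16) & 0xFF; }
    static int unpackC(int p) { return p & 0xFFFF; }

    // One array write per push and one array read per pop.
    void push(int a, int b, int c) { slots[top++] = pack(a, b, c); }
    int pop() { return slots[--top]; }

    public static void main(String[] args) {
        PackedStack s = new PackedStack(16);
        s.push(3, 200, 40000);
        int p = s.pop();
        // prints "3 200 40000"
        System.out.println(unpackA(p) + " " + unpackB(p) + " " + unpackC(p));
    }
}
```

Each push or pop touches the array once; the remaining work is shifts and masks on a value already held in a register.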
But how expensive is it really? I would guess it takes at least two condition checks for the bounds and one multiplication to calculate the address. Does anyone know exactly?
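Spelled out as code, the guess looks roughly like the sketch below. It again assumes Java; `CheckedAccess`, `load`, and `inBounds` are hypothetical names, and the code only illustrates the cost model from the question, not what any particular runtime actually emits.

```java
// Hypothetical sketch of the steps guessed at for a[i];
// a real runtime may do less work than this.
final class CheckedAccess {
    // The guessed model: two comparisons, then the address computation.
    static int load(int[] a, int i) {
        if (i < 0 || i >= a.length) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        // Conceptually: address = base + i * 4 for an int[].
        return a[i];
    }

    // Both bounds folded into a single unsigned comparison:
    // a negative i becomes a huge unsigned value and fails the test.
    static boolean inBounds(int i, int length) {
        return Integer.compareUnsigned(i, length) < 0;
    }
}
```

The second method shows that the two checks can in principle collapse into one comparison, and a multiplication by a power-of-two element size can be a shift, so "two checks and a multiply" is probably an upper bound on the arithmetic rather than the exact cost. Memory and cache effects of touching three arrays instead of one may matter as well.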