# Inverse Square Root II

**Posted:**November 25, 2012

**Filed under:**Bit twiddling, Floating Point Leave a comment

Hmmm, after writing that rather long and laborious post about the bit-level interpretation of fast inverse square root, I came across a couple of excellent blog posts by Christian Plesner Hansen:

http://blog.quenta.org/2012/09/0x5f3759df.html

http://blog.quenta.org/2012/09/0x5f3759df-appendix.html

which got me thinking about all this again (it’s a real mind-worm this one)…

So, now the scales have dropped from my eyes, let’s try again: what we are really doing is using a rough approximation to log2(n), represented as a fixed point number, that happens to be easily computed from the floating point representation. Once we have our approximate log (and exp) function, we can do the the usual computations of roots etc. using normal arithmetic operations on our fixed point representation, before converting back to a float using our exp function.

So, take a number, 2^{n}(1+m) where 0<=m<1, a reasonable approximation to log_{2}(x) is n+m, and we can improve the approximation by adding a small constant offset, σ and, because we are doing this in the fixed point realm, everything works out nicely when we convert back to the floating point realm. Here is a picture for n=0, choosing by eye a value of σ = 0.045:

Now, if interpret a non-negative IEEE-754 float 2^{n}(1+m) as a 9.23 fixed point value (ie. with 9 bits to the left of the binary point, 23 bits to the right), then this fixed point number is (e+m), where e = n+127, and, as above, this is approximates log_{2}(2^{e}(1+m)), so e-127+m = n+m is an approximation to log_{2}(2^{e-127}(1+m)) = log_{2}(2^{n}(1+m)), ie. log_{2} of our original number. Note that e+m is always positive, but e+m-127 may not be, so might need to be represented as a signed value.

As far as actual code goes, first we need to be able to get at the bitwise representation of floats. It’s nice to avoid aliasing issues etc. by using `memcpy`

; my compiler (gcc 4.4.3) at least generates sensible code for this:

static inline float inttofloat(uint32_t n) { float f; memcpy(&f,&n,sizeof(f)); return f; } static inline uint32_t floattoint(float f) { uint32_t n; memcpy(&n,&f,sizeof(n)); return n; }

We are working in our fixed point representation, so adding 127 is actually done by adding 127<<23 = 0x3f800000 and we convert our σ offset in the same way:

uint32_t sigma = 45*(uint32_t(1)<<23)/1000; uint32_t c = (uint32_t(127)<<23) - sigma; int32_t alog(float x) { return floattoint(x)-c; }

We also want an inverse of our approximate log, ie. an approximate exp, but this is just a reverse of the previous step:

float aexp(int32_t n) { return inttofloat(n+c); }

Provided we haven’t left the realm of representable floats, the sign bit of the result should alway be zero.

We can use our log approximation directly as well of course, to convert from fixed point we need:

float unfix(int32_t n) { return float(n)/(1<<23); } float unfix(uint32_t n) { return float(n)/(1<<23); }

Another nice picture, showing our `log2`

approximation over a wider range:

Now we can define:

float sqrt(float x) { return aexp(alog(x)/2); } float invsqrt(float x) { return aexp(alog(x)/-2); } float cubert(float x) { return aexp(alog(x)/3); }

and so on.

To relate this to the original magic function, for an inverse square root, the integer manipulations are:

invsqrt(n) = -((n-c)/2) + c = -(n/2 - c/2) + c // Not exact, but close enough = c/2 - n/2 + c = -(n>>1) + ((c>>1) + c)

Calculating (c>>1) + c we get 0x5f375c29, satisfyingly close to the original magic constant 0x5f3759df…

We can use signed or unsigned fixed point numbers for our logs, the arithmetic will be the same, providing we can avoid overflow, so for example, if to get an approximate value for `log(factorial(50))`

, we can do:

uint32_t fact = 0; for (int i = 1; i <= 50; i++) { fact += alog(i); } printf("%g\n", unfix(fact));

giving the result 213.844, comparing nicely to the true result of 214.208. Note that if I were to have used a signed value for fact, the result would have overflowed.

Be warned, this is only a very rough approximation to the log2 function, only use it if a very crude estimate is good enough, or if you are going do some further refinement of the value. Alternatively, your FPU almost certainly uses some variation of this technique to calculate logs (or at least an initial approximation) so you could just leave it to get on with it.

Like many things, this isn’t particularly new, the standard reference is:

J. N. Mitchell, “Computer multiplication and division using binary logarithms,” IRE Trans. Electron. Computers, vol. 11, pp. 512–517, Aug. 1962