The most significant bits of a floating point representation are basically a logarithm of the number, and logarithms turn multiplication into addition.
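(An illustrative sketch of that view, not from the comment: a positive f32's raw bit pattern, scaled by 2^-23 and de-biased by 127, tracks log2 of the value.)

    // A positive f32's bit pattern, scaled and de-biased, approximates log2(x).
    fn main() {
        for &x in &[0.37_f32, 1.0, 1.5, 6.02e23] {
            let bits_as_log2 = x.to_bits() as f64 / (1u32 << 23) as f64 - 127.0;
            println!("x = {x:e}  log2(x) = {:.4}  bits-as-log2 = {bits_as_log2:.4}", x.log2());
        }
    }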
sethaurus 5 hours ago [-]
This is the core trick in the Fast Inverse Square Root[0], as made famous by the Quake III source code.
It uses a shift (equivalent to dividing) and a subtraction in integer-land to estimate x^(-0.5) in float-land.
[0]: https://en.m.wikipedia.org/wiki/Fast_inverse_square_root
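Roughly, in Rust (a sketch of the same trick; the constant 0x5f3759df is the famous Quake III one):

    // Estimate x^(-0.5): shift and subtract on the bit pattern, then one Newton step to refine.
    fn fast_inv_sqrt(x: f32) -> f32 {
        let i = 0x5f37_59df_u32 - (x.to_bits() >> 1); // integer-land: halve the "logarithm", negate it
        let y = f32::from_bits(i);                    // float-land: rough estimate of 1/sqrt(x)
        y * (1.5 - 0.5 * x * y * y)                   // one Newton-Raphson step sharpens the estimate
    }

For example, fast_inv_sqrt(4.0) comes out very close to 0.5.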
I've always hated that it's called an “inverse”. It's not an inverse. The inverse of square root is square. If they had called it “reciprocal”, it would have been clear to me what it does, but “inverse” confused the hell out of me when I first saw it.
“In mathematics, the concept of an inverse element generalises the concepts of opposite (−x) and reciprocal (1/x) of numbers.”
Animats 5 hours ago [-]
Neat. Of course it works for the exponent, but that it's not that far off for the mantissa is unexpected. It helps that the mantissa is normalized.
guyomes 5 hours ago [-]
For the mantissa, the insight is probably that:
(1+e1)*(1+e2) = 1+e1+e2+(e1*e2)
If e1 and e2 are small, then e1*e2 is negligible.
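A small illustration (mine, with made-up inputs): add the bit patterns of two positive floats and subtract the bits of 1.0, and you get approximately the bits of the product; the shortfall is exactly that dropped e1*e2 term.

    // Approximate multiplication of positive f32s by adding their bit patterns.
    fn approx_mul(a: f32, b: f32) -> f32 {
        let bias = 1.0_f32.to_bits(); // 0x3f80_0000
        f32::from_bits(a.to_bits().wrapping_add(b.to_bits()).wrapping_sub(bias))
    }

    fn main() {
        let (a, b) = (1.3_f32, 2.7_f32);
        println!("exact  = {}", a * b);            // 3.51
        println!("approx = {}", approx_mul(a, b)); // about 3.3; the gap is the dropped e1*e2
    }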
thequux 3 hours ago [-]
Huh, and in the case that e1 and e2 are large, the mantissa overflows into the exponent, multiplying the result by 2.
amelius 3 hours ago [-]
I guess we're lucky that the IEEE 754 committee put the exponent in front of the mantissa. Or maybe they were aware of this trick all along ...
tialaramex 2 hours ago [-]
This arrangement makes ordering easier anyway, which is probably why they chose it.
If we have positive 32-bit integers A and B with A < B, then f32::from_bits(A) < f32::from_bits(B) (as long as neither bit pattern is a NaN).
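A quick spot-check of that claim (a sketch that samples pairs of consecutive finite positive bit patterns; NaN patterns are excluded since any comparison with a NaN is false):

    // Sampled check: integer order of bit patterns matches float order on the positive finite range.
    fn main() {
        let max_finite = f32::MAX.to_bits(); // 0x7f7f_ffff
        for a in (0..max_finite).step_by(9_973) {
            let b = a + 1;
            assert!(f32::from_bits(a) < f32::from_bits(b));
        }
        println!("integer order and float order agree on every sampled pair");
    }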
Edited to add anecdote:
I actually had a bug in realistic (my Rust crate implementing Hans Boehm's "Towards an API for the Real Numbers") which I only fixed recently. When converting my Real number into a floating point type I can end up with too many mantissa bits, and instead of incrementing the exponent by one my code overwrote its bottom bit with a one. Whenever that bit happened to be zero the two operations give the same result, so the bug was often masked.
I finally caught it when it occurred to me to do "thorough testing": take every possible 32-bit float, turn it into a Real, then turn that Real back into a 32-bit float. This should round-trip exactly; if it doesn't, we've found a bug. There are "only" a few billion values to test, and a machine can do that.
I fixed the 64-bit routines for the same bugs, but I don't have enough years to run all the 64-bit tests the same way and discover any bugs that have no 32-bit counterpart.
[Of course there are presumably still bugs in the conversion algorithms that are either symmetric, so they don't affect a round trip, or that only involve values you can never get from a float: Real has proper rationals and various other things like square roots of integers, Pi, etc., which we can convert to a float while losing precision, but we never get those from converting a float, because floats are always binary fractions.]
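That exhaustive strategy is easy to sketch. I don't want to guess the realistic crate's actual API, so in the sketch below f64 stands in for Real (every finite f32 widens to f64 and narrows back losslessly):

    // Round-trip every 32-bit pattern, skipping NaNs (their payloads need not survive).
    fn main() {
        let mut failures = 0u64;
        for bits in 0..=u32::MAX {
            let x = f32::from_bits(bits);
            if x.is_nan() {
                continue;
            }
            let back = (x as f64) as f32; // stand-in for f32 -> Real -> f32
            if back.to_bits() != bits {
                failures += 1;
            }
        }
        println!("failures: {failures}"); // expected 0
    }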
It's math, and not just for integers. That's also how a slide rule works:
a×b = 10^(log(a×b))
log(a×b) = log(a)+log(b)
thus a×b = 10^(log(a)+log(b))
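The same identity in a few lines of Rust (base 10, like the slide rule's scales):

    // Multiply by adding logarithms, slide-rule style.
    fn main() {
        let (a, b) = (3.0_f64, 7.0_f64);
        println!("{}", 10f64.powf(a.log10() + b.log10())); // about 21
    }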
Sharlin 5 hours ago [-]
Yes, but what is nontrivial here is how good the approximation is, despite the mantissa part being linear (much better than the upper bound of being within 2x of the real value).
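One way to put a rough number on it (my sketch, for the bits-as-log2 reading): within one mantissa period the linear term x - 1 stays within about 0.09 of log2(x), which corresponds to a value error of roughly 6%, far better than a factor of 2.

    // Worst gap between log2(x) and its piecewise-linear stand-in (x - 1) on [1, 2).
    fn main() {
        let mut worst = 0.0_f64;
        for i in 0..1_000_000 {
            let x = 1.0 + i as f64 / 1_000_000.0;
            worst = worst.max((x.log2() - (x - 1.0)).abs());
        }
        println!("max |log2(x) - (x - 1)| on [1, 2): {:.4}", worst); // about 0.086
    }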
shiandow 4 hours ago [-]
Nontrivial yes, but it's not that complicated to figure out that floats are a piecewise linear approximation of the base 2 logarithm.
Edit: or base 2^k, depends a bit how you view it.
rowanG077 3 hours ago [-]
This can be well worth it in circuit design for custom accelerators. Floating point operations are often multicycle. If you can get close enough with addition, it will save a ton of resources and probably also simplify the design.
Addition is all you need for energy-efficient language models - https://news.ycombinator.com/item?id=41784591 - Oct 2024 - 126 comments