7. Fractions and Floating Point
💀💀💀, yeah these are not fun, but we need them in the real world.
Fractional Numbers
$2024$
Ok, what about real numbers?
$2024.320$
Ah there we go, now let's make our lives hard.
Recall: $2024.320 = 2\cdot 10^3+0\cdot 10^2+2\cdot 10^1+4\cdot 10^0+3\cdot 10^{-1}+2\cdot 10^{-2}+0\cdot 10^{-3}$
Now let's do the same with binary:
$0110.1101 = 0\cdot 2^3 + 1\cdot 2^2 + 1\cdot 2^1 + 0 \cdot 2^0 + 1\cdot 2^{-1} + 1\cdot 2^{-2} + 0\cdot 2^{-3} + 1\cdot 2^{-4}$
And to convert from decimal to binary:
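A minimal sketch of one common way to do this by hand: handle the integer part with repeated division by 2, and the fractional part with repeated multiplication by 2, where each digit that crosses the point becomes the next bit. The function name `to_binary` and the `max_fraction_bits` cutoff are made up for this illustration.

```python
def to_binary(value: float, max_fraction_bits: int = 16) -> str:
    """Convert a non-negative number to a binary string such as '110.1101'."""
    integer_part = int(value)
    fraction = value - integer_part

    # Integer part: bin() performs the repeated division by 2 for us.
    integer_bits = bin(integer_part)[2:]

    # Fraction part: multiply by 2; whatever lands left of the point is the next bit.
    fraction_bits = []
    while fraction > 0 and len(fraction_bits) < max_fraction_bits:
        fraction *= 2
        bit = int(fraction)
        fraction_bits.append(str(bit))
        fraction -= bit

    return integer_bits + "." + ("".join(fraction_bits) or "0")


print(to_binary(6.8125))  # 110.1101
```

This is the same value as the $0110.1101$ expansion above: $4 + 2 + 0.5 + 0.25 + 0.0625 = 6.8125$.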
So, uuhhh, what about 0.1? Well, we can't represent it without an infinite number of bits:
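To see why, here is a small sketch that runs the multiply-by-2 method on exactly one tenth. It uses `fractions.Fraction` so that no rounding sneaks in; the helper name `fraction_bits` is just for this example.

```python
from fractions import Fraction


def fraction_bits(value: Fraction, count: int) -> str:
    """Return the first `count` bits after the binary point of 0 <= value < 1."""
    bits = []
    for _ in range(count):
        value *= 2
        bit = int(value)  # 0 or 1: the digit that crossed the point
        bits.append(str(bit))
        value -= bit
    return "".join(bits)


print(fraction_bits(Fraction(1, 10), 20))  # 00011001100110011001
print(0.1 + 0.2 == 0.3)                    # False
```

The pattern `0011` repeats forever, so any finite number of bits has to cut it off somewhere. That cutoff is why `0.1 + 0.2 == 0.3` is false.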
Also, where do we put the binary point? Well, with a fixed number of bits we can have accuracy (more bits after the point) or range (more bits before it) or....
We just move the point, oh floating point!
Floating Point
Remember normalized scientific notation? $-0.0039=-3.9\cdot 10^{-3}$.
This makes representing floating-point numbers much easier.
IEEE 754
This is the IEEE 754 standard, and although it is a great standard, it has some tradeoffs. For example, addition is not associative: take a (F)eather (small value) and an (E)lephant (huge value). If you do E-E+F, you will get F, since you jump from E to 0 and then add F. If you do E+F-E, you jump to E, then to E+F (which is really just E, because F is too small to register), and then you get 0. Next to a big value, a small value simply disappears.
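The feather-and-elephant effect is easy to reproduce. A quick sketch with Python floats (which are double precision); the particular values `1e20` and `1.0` are just picked for illustration:

```python
elephant = 1e20  # huge value (E)
feather = 1.0    # small value (F)

# Python evaluates left to right, so the grouping matches the text.
print(elephant - elephant + feather)  # 1.0
print(elephant + feather - elephant)  # 0.0
```

Same three numbers, different order, different answer.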
Each time the exponent goes up by one, the gap between neighboring representable values doubles, so bigger numbers carry less absolute precision.
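One way to watch the precision loss happen is to print the gap between a value and the next representable one. This sketch assumes Python 3.9 or newer, where `math.ulp` is available, and uses double precision because that is what a Python float is.

```python
import math

# The gap to the next representable double doubles at every power of two.
for value in (1.0, 2.0, 4.0, 2.0**52):
    print(value, math.ulp(value))

# Past 2**53 the gap is bigger than 1, so adding 1.0 changes nothing.
print(2.0**53 + 1.0 == 2.0**53)  # True
```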
So, just never use floating point unless you know what you are doing.
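Now for IEEE 754 itself: it uses sign-magnitude, so single precision has 1 sign bit, an 8-bit exponent, and a 23-bit significand (fraction) field, with the leading 1 of a normalized number left implicit (arithmetic HARD).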
The exponent is... interesting:
We use biased notation, which is $\text{biased representation}=\text{exponent}+\text{bias constant}$. In this case, the bias constant is 127.
This gives us exponents from -126 to 127 (1 to 254 biased). Note: 0 and 255 are reserved!
An example to decode the value:
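A sketch of decoding by hand, assuming a normalized single-precision value: split the 32 bits into sign, biased exponent, and fraction, undo the bias, and put the implicit leading 1 back. The function name `decode_float32` and the bit pattern `0xC0B00000` are chosen for this example only.

```python
def decode_float32(bits: int) -> float:
    """Decode the 32-bit pattern of a normalized single-precision number."""
    sign = bits >> 31                      # 1 bit
    biased_exponent = (bits >> 23) & 0xFF  # 8 bits
    fraction = bits & 0x7FFFFF             # 23 bits

    if biased_exponent in (0, 255):
        raise ValueError("reserved exponent: zero, denormalized, infinity, or NaN")

    exponent = biased_exponent - 127       # undo the bias
    significand = 1 + fraction / 2**23     # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0**exponent


# 1 | 10000001 | 01100000000000000000000
# sign = 1, biased exponent = 129 (so exponent = 2), significand = 1.011b = 1.375
print(decode_float32(0xC0B00000))  # -5.5
```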
An example to encode the value:
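Going the other way, take $-0.75$ as an illustrative value: $-0.75 = -0.11_2 = -1.1_2\cdot 2^{-1}$, so the sign bit is 1, the biased exponent is $-1+127=126$, and the fraction field is a 1 followed by zeros. The sketch below checks that against the standard `struct` module; the helper name `float32_fields` is made up.

```python
import struct


def float32_fields(value: float) -> tuple:
    """Return (sign, biased_exponent, fraction) of value stored as single precision."""
    # Pack as a big-endian float32, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


sign, biased_exponent, fraction = float32_fields(-0.75)
print(sign, biased_exponent, hex(fraction))  # 1 126 0x400000
```

`0x400000` is the 23-bit field with only its top bit set, which is the `.1` in $1.1_2$.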
And to add values:
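Roughly, addition means: line up the exponents, add the significands, then renormalize. Below is a deliberately simplified sketch of those steps. It assumes positive values only, stores each number as an integer significand times $2^{\text{exp}}$, and truncates instead of applying the IEEE 754 rounding rules, so it is a toy and not the real algorithm. The name `add_floats` and the 4-bit `precision` in the examples are made up.

```python
def add_floats(sig_a, exp_a, sig_b, exp_b, precision=24):
    """Add two positive values sig * 2**exp, keeping `precision` significand bits."""
    # Step 1: make `a` the operand with the larger exponent.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a

    # Step 2: align exponents by shifting the smaller operand right.
    # Bits shifted out here are simply lost (truncation, not rounding).
    sig_b >>= exp_a - exp_b

    # Step 3: add the aligned significands.
    total = sig_a + sig_b

    # Step 4: renormalize if the sum no longer fits in `precision` bits.
    while total >= 1 << precision:
        total >>= 1
        exp_a += 1
    return total, exp_a


# 1.5 = 1100b * 2**-3 and 0.25 = 1000b * 2**-5, with 4-bit significands
print(add_floats(12, -3, 8, -5, precision=4))  # (14, -3), i.e. 1.75
# 8.0 + 0.25: the small operand is shifted out entirely
print(add_floats(8, 0, 8, -5, precision=4))    # (8, 0), i.e. still 8.0
```

The second call is the feather and the elephant again: after alignment there is nothing left of the small value to add.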
There is also double-precision and half-precision:
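For comparison, half precision uses 5 exponent bits and 10 fraction bits, single uses 8 and 23, and double uses 11 and 52; in each case the bias is $2^{\text{exponent bits}-1}-1$. The sketch below recomputes the sizes and biases and shows how much of 0.1 survives in each format, using the `struct` format codes `e`, `f`, and `d`. The helper name `describe` is made up.

```python
import struct


def describe(code: str, exponent_bits: int, fraction_bits: int) -> tuple:
    """Return (total bits, bias, 0.1 after a round trip) for a struct float code."""
    total_bits = struct.calcsize(">" + code) * 8
    assert total_bits == 1 + exponent_bits + fraction_bits  # sign + exponent + fraction
    bias = 2 ** (exponent_bits - 1) - 1
    (roundtrip,) = struct.unpack(">" + code, struct.pack(">" + code, 0.1))
    return total_bits, bias, roundtrip


for name, layout in {"half": ("e", 5, 10), "single": ("f", 8, 23), "double": ("d", 11, 52)}.items():
    print(name, describe(*layout))
```

More fraction bits means the stored value lands closer to 0.1, but none of the three hits it exactly.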
Now for those reserved values:
Single-precision exponent | Single-precision fraction | Double-precision exponent | Double-precision fraction | Meaning |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
0 | != 0 | 0 | != 0 | Denormalized number |
255 | 0 | 2047 | 0 | Infinity (sign bit defines + or -) |
255 | != 0 | 2047 | != 0 | NaN (Not a Number) |
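The single-precision half of the table translates almost directly into code. A minimal sketch, with a made-up function name `classify_float32`, that looks only at the exponent and fraction fields:

```python
def classify_float32(bits: int) -> str:
    """Classify a 32-bit single-precision pattern using the reserved exponents."""
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF

    if biased_exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if biased_exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


for pattern in (0x00000000, 0x00000001, 0x7F800000, 0xFF800000, 0x7FC00000, 0x3F800000):
    print(f"{pattern:#010x}", classify_float32(pattern))
```

The sign bit is ignored here, so `0x7F800000` and `0xFF800000` both come back as infinity; the sign bit is what tells $+\infty$ from $-\infty$.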