
7. Fractions and Floating Point

💀💀💀, yeah these are not fun, but we need them in the real world.

Fractional Numbers

$2024$

OK, what about real numbers?

$2024.320$

Ah there we go, now let's make our lives hard.

Recall: $2024.320 = 2\cdot 10^3+0\cdot 10^2+2\cdot 10^1+4\cdot 10^0+3\cdot 10^{-1}+2\cdot 10^{-2}+0\cdot 10^{-3}$

Now let's do the same with binary:

alt text

$0110.1101 = 0\cdot 2^3 + 1\cdot 2^2 + 1\cdot 2^1 + 0 \cdot 2^0 + 1\cdot 2^{-1} + 1\cdot 2^{-2} + 0\cdot 2^{-3} + 1\cdot 2^{-4}$
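The same digit-by-digit sum can be sketched in a few lines of Python. The helper name `binary_to_decimal` is made up for illustration; it just multiplies each bit by its power of two, exactly as in the expansion above.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '0110.1101' digit by digit."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Integer digits carry weights 2^0, 2^1, ... reading right to left.
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Fraction digits carry weights 2^-1, 2^-2, ... reading left to right.
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(binary_to_decimal("0110.1101"))  # 6.8125
```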

And to convert from decimal to binary:

alt text
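One common way to do the fractional part of this conversion is repeated doubling: multiply by 2, peel off the integer part as the next bit, repeat. Here is a minimal sketch of that idea, assuming the input is a fraction in $[0, 1)$. The function name `fraction_to_binary` and the `max_bits` cutoff are made up for this example, and `Fraction` is used so the arithmetic itself is exact.

```python
from fractions import Fraction


def fraction_to_binary(value: Fraction, max_bits: int = 16) -> str:
    """Binary digits of a fraction in [0, 1), found by repeated doubling."""
    bits = []
    while value and len(bits) < max_bits:
        value *= 2
        if value >= 1:          # the integer part is the next bit
            bits.append("1")
            value -= 1
        else:
            bits.append("0")
    return "0." + ("".join(bits) or "0")


print(fraction_to_binary(Fraction(13, 16)))  # 0.1101
print(fraction_to_binary(Fraction(1, 10)))   # 0.0001100110011001
```

The second call hits the `max_bits` cutoff: the pattern `0011` keeps repeating and the remainder never reaches zero.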

So, uuhhh, what about 0.1? Well, we can't represent it without an infinite number of bits:

alt text

alt text
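You can see this from Python, whose `float` is a 64-bit binary floating-point value on essentially every platform (an assumption this sketch relies on). Converting the stored value to an exact `Fraction` or `Decimal` shows that what is stored is only the nearest representable neighbor of 0.1.

```python
from decimal import Decimal
from fractions import Fraction

stored = Fraction(0.1)             # exact value of the float nearest to 0.1
print(stored == Fraction(1, 10))   # False: the stored value is not exactly 1/10
print(Decimal(0.1))                # prints every digit of the stored value
print(0.1 + 0.2 == 0.3)            # False: the tiny errors do not cancel out
```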

Also, where do we put the binary point? Well, with a fixed number of bits we can have precision or range or....

We just move the point, oh floating point!

Floating Point

Remember normalized scientific notation? $-0.0039=-3.9\cdot 10^{-3}$.

This makes representing floating-point numbers much easier.

alt text

alt text

alt text

IEEE 754

alt text

This is the IEEE 754 standard, and although it is a great standard, it has some tradeoffs. For one, addition is not associative: take a (F)eather (small value) and an (E)lephant (huge value). If you compute E-E+F, you get F, since you jump from E to 0 and then add F. If you compute E+F-E, you jump to E, then to E+F (which rounds to just E), and then you get 0. Next to a big value, you lose the ability to see a small one.
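The feather and the elephant are easy to reproduce. The values `1e20` and `1.0` below are arbitrary picks for this sketch; any pair far enough apart in magnitude behaves the same way.

```python
E = 1e20   # elephant: a huge value
F = 1.0    # feather: a small value

left = (E - E) + F    # E cancels first, so the feather survives
right = (E + F) - E   # the feather is rounded away before E is subtracted

print(left)   # 1.0
print(right)  # 0.0
```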

Each time the exponent goes up by one, the gap between neighboring representable values doubles, aka you lose some precision as the values get bigger.

0 
|-|--|----|--------|----------------|--------------------------------|
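A quick way to watch the gaps double is `math.ulp`, which returns the distance from a float to the next larger one (available from Python 3.9). Note the assumption here: Python floats are double precision, so the absolute gaps are smaller than for 32-bit floats, but the doubling pattern is the same.

```python
import math

# Gap to the next representable value at each power of two.
gaps = {x: math.ulp(x) for x in (1.0, 2.0, 4.0, 8.0)}

for x, gap in gaps.items():
    print(f"gap after {x}: {gap}")
```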

So, just never use floating point unless you know what you are doing.

Now for the IEEE 754 format itself: it is in sign-magnitude, and in single precision that means 1 sign bit, an 8-bit exponent, and a 23-bit fraction field for the significand (arithmetic HARD).

The exponent is... interesting:

We use biased notation, which is $\text{biased representation}=\text{exponent}+\text{bias constant}$. In this case, the bias constant is 127.

alt text

This gives us exponents from -126 to 127 (1 to 254 biased). Note: 0 and 255 are reserved!

An example to decode the value:

alt text
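Decoding can also be sketched in code. The function below (a hypothetical helper, `decode_float32`) splits a 32-bit pattern into sign, biased exponent, and fraction, then undoes the bias of 127. It assumes a normalized value with an implicit leading 1 and rejects the reserved exponents. The pattern `0xC0A00000` is just an example picked for this sketch.

```python
import struct


def decode_float32(bits: int) -> float:
    """Decode a normalized single-precision bit pattern by hand."""
    sign = bits >> 31                 # 1 bit
    biased = (bits >> 23) & 0xFF      # 8 bits
    fraction = bits & 0x7FFFFF        # 23 bits
    if not 1 <= biased <= 254:
        raise ValueError("biased exponents 0 and 255 are reserved")
    significand = 1 + fraction / 2 ** 23   # implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (biased - 127)


bits = 0xC0A00000   # sign 1, biased exponent 129, fraction 0.25
print(decode_float32(bits))                              # -5.0
print(struct.unpack(">f", bits.to_bytes(4, "big"))[0])   # -5.0
```

The second print asks the `struct` module to do the same decoding, as a cross-check.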

An example to encode the value:

alt text
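Going the other way, here is a minimal sketch of encoding, again for normalized values only (zero, denormals, infinity, and NaN are deliberately left out). `encode_float32` is a made-up name. It leans on `math.frexp` to find the exponent, rescales to the $1.f$ form, and adds the bias of 127.

```python
import math
import struct


def encode_float32(value: float) -> int:
    """Encode a normalized value as a single-precision bit pattern."""
    if value == 0 or not math.isfinite(value):
        raise ValueError("this sketch only handles normalized values")
    sign = 1 if value < 0 else 0
    mantissa, exp = math.frexp(abs(value))    # abs(value) == mantissa * 2**exp
    significand, exponent = mantissa * 2, exp - 1   # rescale to 1.f * 2**exponent
    fraction = round((significand - 1) * 2 ** 23)   # keep 23 fraction bits
    if fraction == 2 ** 23:                   # rounding carried into the next power of two
        fraction, exponent = 0, exponent + 1
    biased = exponent + 127
    if not 1 <= biased <= 254:
        raise ValueError("exponent out of range for a normalized value")
    return (sign << 31) | (biased << 23) | fraction


print(hex(encode_float32(-5.0)))                          # 0xc0a00000
print(hex(int.from_bytes(struct.pack(">f", -5.0), "big")))  # 0xc0a00000
```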

And to add values:

alt text
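The usual recipe for addition is: align the exponents, add the significands, renormalize. The sketch below is a simplified version of that recipe, not the full IEEE 754 procedure: it assumes positive inputs, integer significands with the top bit set, and it truncates instead of rounding. The name `add_positive` and the `bits` parameter are made up for illustration.

```python
def add_positive(sig_a: int, exp_a: int, sig_b: int, exp_b: int, bits: int = 24):
    """Add two positive values sig * 2**exp with `bits`-bit significands."""
    # 1. Align: make (sig_a, exp_a) the operand with the larger exponent,
    #    then shift the other significand right to match it.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b >>= exp_a - exp_b
    # 2. Add the aligned significands.
    total = sig_a + sig_b
    # 3. Renormalize if the sum needs one bit too many.
    if total >> bits:
        total >>= 1
        exp_a += 1
    return total, exp_a


# 1.5 + 0.75 with 4-bit significands: 1100 * 2^-3 + 1100 * 2^-4
print(add_positive(0b1100, -3, 0b1100, -4, bits=4))  # (9, -2), i.e. 1001 * 2^-2 = 2.25
```

If the exponents are far apart, step 1 shifts the smaller significand all the way to zero, and the big value comes back unchanged.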

There is also double-precision and half-precision:

alt text

alt text
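To get a feel for the three sizes, you can store the same value in each and read it back. This sketch uses the `struct` format codes `e` (half), `f` (single), and `d` (double). The helper `round_trip` is made up for this example.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store value in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


for name, fmt in [("half", "<e"), ("single", "<f"), ("double", "<d")]:
    size = struct.calcsize(fmt) * 8
    print(f"{name} ({size} bits): {round_trip(0.1, fmt)!r}")
```

The fewer bits a format has, the further the stored value lands from 0.1.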

Now for those reserved values:

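| Single precision exponent | Single precision fraction | Double precision exponent | Double precision fraction | Meaning |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | != 0 | 0 | != 0 | Number is denormalized |
| 255 | 0 | 2047 | 0 | Infinity (sign bit defines + or -) |
| 255 | != 0 | 2047 | != 0 | NaN (Not a Number) |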
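The single-precision cases above boil down to a few comparisons on the exponent and fraction fields. A minimal sketch, where `classify_float32` is a hypothetical helper:

```python
def classify_float32(bits: int) -> str:
    """Name the category of a single-precision bit pattern."""
    sign = "-" if bits >> 31 else "+"
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return f"{sign}zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return f"{sign}infinity" if fraction == 0 else "NaN"
    return "normalized"


for pattern in (0x00000000, 0x00000001, 0x3F800000, 0xFF800000, 0x7FC00000):
    print(f"{pattern:#010x}: {classify_float32(pattern)}")
```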
