2. Memory and Addresses
Before we begin:
- STORE COPIES FROM THE CPU TO MEMORY
- LOAD COPIES FROM MEMORY TO CPU
Ok, thus far we have been putting numbers into registers with `li a0, 3` and COPYing register contents with `move a0, t0`.
Syscall
When printing something, you are interacting with the hardware which does the printing for us.
So what is the issue with registers? Well, there are only 32 of them, and that is not enough to run programs. Solution: memory. The system memory is a piece of temporary storage hardware; it is smaller, faster, and more expensive than persistent storage. It is where the programs and data that the computer is currently executing and using reside: all the variables, all the functions, all the open files, etc. The CPU can only run programs from system memory! Essentially, a program has access to all of it, but is not necessarily using all of it.
The memory is a big one-dimensional array of bytes. Every byte value has an address which is its array index. Addresses start at, SHOCKER, 0. When each byte has its own address, we call it a byte-addressable machine.
If each address refers to one byte and your addresses are $n$ bits long... how many bytes can your memory have? $2^n$ B. So a machine with 32-bit addresses can access $2^{32}$ B = $4$ GiB of memory.
For most things, we want to use words, since a word is the most comfortable integer size for the CPU. One word is $32$ b or $4$ B. Wait! Our memory only holds bytes. Well, we combine multiple bytes into larger values (the CPU does it for us, and the data is still just bytes). When we talk about values bigger than a byte, the address is the address of their first byte.
MIPS is a register-register architecture (load-store), so only special instructions can access memory. All memory accesses are done with two kinds of instructions: loads and stores.
LOADS copy data FROM memory INTO CPU registers.
STORES copy data FROM CPU registers INTO memory.
Danger
When we talk about loads and stores we are only talking about instructions that access memory.
Everything in memory has an address. Also, every piece of data has two parts: an address and a value.
MIPS
To declare a global variable, start with the directive `.data` for the assembler. Then write the name of the variable, a colon, the type, and the initial value, e.g. `x: .word 4`. This is the equivalent of saying `static int x = 4;` in C. After declaring as many variables as you want, you must write `.text` to go back to writing code.
To load the address of a variable, use `la` (load address), e.g. `la t0, x`. This puts the 32-bit address of `x` into `t0`, which is similar to `&x` in C.
To actually load and store the 32-bit words use lw and sw.
Indirect addressing with offset:
Indirect: The memory address is the value in a register.
Offset: You add a constant to that value.
$\text{Effective address} = \text{value of register}_{\text{address}} + \text{offset}$
So comparing to C and asm.
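Or you know, just use the shortcut: give `lw` or `sw` the variable's name directly, e.g. `lw t0, x`, and let the assembler work out the address for you.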
Smaller Values
What if a word is too big for me? Well, good thing there are also half (2 bytes) and byte (1 byte).
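Wait, how do we load them!?! With `lh` and `lb`, and we store them with `sh` and `sb`. They work just like `lw` and `sw`.
Wow, that was hard! Wait... a register is still 32 bits, so what happens to those extra bits???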
Well, sometimes we need to widen a number with fewer bits to more. Zero extension is easy: put 0s at the beginning.
$1001_2 \rightarrow \text{to 8 bits} \rightarrow 00001001_2$
Wait, signed numbers exist. Sign extension puts copies of the sign bit at the beginning.
$1001_2 \rightarrow \text{to 8 bits} \rightarrow 11111001_2$
$0010_2 \rightarrow \text{to 8 bits} \rightarrow 00000010_2$
When you do `lb`, it does sign extension; `lbu` does zero extension. However, the CPU does not know whether a byte in memory is signed or unsigned, so you have to use the right instruction.
When going the other way, the upper part of the value is cut off, or truncated. The sign issue doesn't exist when storing because we are going from a larger number of bits to a smaller number. So there are only `sb` and `sh`, and no `sbu` or `shu`.
Endianness
Let's say there's a word at address 4 made of 4 bytes. How is it represented?
This is the idea of endianness: the rule used to decide what order to put the bytes in memory.
Note
It has nothing to do with the value of the bytes, only the order you store them in memory.
Example
Address:  0   1   2   3
Byte:     DE  C0  EF  BE
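Little-endian means "you start with the LITTLE end" (the least significant byte goes at the lowest address): 0xBEEFC0DE
Big-endian means "you start with the BIG end" (the most significant byte goes at the lowest address): 0xDEC0EFBE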
Does it matter? No, as long as you're consistent; however, most computers use little-endian.
Endianness does not affect:
- the arrangement of the bits within a byte.
- 1-byte values, arrays of bytes, ASCII strings.
- the ordering of bytes inside the CPU.
Endianness only affects moving/splitting data that is larger than a byte.