2. Memory and Addresses

Before we begin:

STORE COPIES FROM THE CPU TO MEMORY
LOAD COPIES FROM MEMORY TO THE CPU

OK, thus far we have been putting numbers into registers with li a0, 3 and copying register contents with move a0, t0.

Syscall

When you print something, you are interacting with the hardware, which does the printing for you.

So what is the issue with registers? Well, there are only 32 of them, and that is not enough to run programs. Solution: memory. The system memory is a piece of temporary storage hardware; it is smaller, faster, and more expensive than persistent storage. It is where the programs and data that the computer is currently executing and using reside: all the variables, all the functions, all the open files, etc. The CPU can only run programs from system memory! Essentially, a program has access to memory, but it is not necessarily using all of it.

The memory is a big one-dimensional array of bytes. Every byte value has an address which is its array index. Addresses start at, SHOCKER, 0. When each byte has its own address, we call it a byte-addressable machine.

If each address refers to one byte and your addresses are $n$ bits long... how many bytes can your memory have? $2^n$ B. So a machine with 32-bit addresses can access $2^{32}$ B = $4$ GiB of memory.
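
As a quick check of that arithmetic, here is a minimal C sketch. The function name address_space_bytes is made up for illustration and is not part of MIPS or any library.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes reachable with n-bit addresses on a byte-addressable
   machine: 2^n. The name address_space_bytes is hypothetical. */
uint64_t address_space_bytes(unsigned n) {
    assert(n < 64);              /* 2^64 would not fit in a uint64_t */
    return (uint64_t)1 << n;     /* shifting 1 left by n gives 2^n */
}
```

Calling address_space_bytes(32) gives 4294967296, and dividing that by 1024 three times gives 4, which is the 4 GiB figure above.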

For most things, we want to use words, since a word is the most comfortable integer size for the CPU. In MIPS, one word is $32$ b or $4$ B. Wait! Our memory only holds bytes. Well, we combine multiple bytes into larger values (the CPU does it for us, and the data is still just bytes). When we talk about values bigger than a byte, the address is the address of their first byte.

MIPS is a register-register architecture (load-store), so only special instructions can access memory. All memory accesses are done with two kinds of instructions: loads and stores.

LOADS copy data FROM memory INTO CPU registers.
STORES copy data FROM CPU registers INTO memory.

Danger

When we talk about loads and stores we are only talking about instructions that access memory.

Everything in memory has an address. Also, every piece of data has two parts: an address and a value.

MIPS

To declare a global variable:

.data
    x: .word 4

So we start with the assembler directive .data. Then comes the name of the variable, a colon, the type, and the initial value. This is the equivalent of saying static int x = 4; in C. After declaring as many variables as you want, you must write .text before your code.

To load the address of a variable:

la t0, x

This puts the 32-bit address of x into t0, which is similar to &x in C.

To actually load and store 32-bit words, use lw and sw.

lw/sw register_data, offset(register_address)

Indirect addressing with offset:
Indirect: The memory address is the value in a register.
Offset: You add a constant to that value.

$\text{effective address} = \text{value of register\_address} + \text{offset}$
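
To make the effective-address rule concrete, here is a rough C sketch of a toy memory. Everything in it (memory, MEM_SIZE, load_word, store_word) is a hypothetical stand-in for illustration, not how a real MIPS CPU or simulator is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64   /* hypothetical tiny memory, just for the sketch */

/* Memory is a one-dimensional array of bytes; the index is the address. */
static uint8_t memory[MEM_SIZE];

/* lw-like: copy the 4-byte word starting at base + offset out of memory. */
uint32_t load_word(uint32_t base, int32_t offset) {
    uint32_t addr = base + (uint32_t)offset;   /* effective address */
    assert(addr <= MEM_SIZE - 4);              /* stay inside the toy memory */
    uint32_t value;
    memcpy(&value, &memory[addr], sizeof value);
    return value;
}

/* sw-like: copy a 4-byte word into memory starting at base + offset. */
void store_word(uint32_t value, uint32_t base, int32_t offset) {
    uint32_t addr = base + (uint32_t)offset;   /* effective address */
    assert(addr <= MEM_SIZE - 4);
    memcpy(&memory[addr], &value, sizeof value);
}
```

With this model, store_word(v, 8, 4) and load_word(12, 0) touch the same word, because both effective addresses work out to 12.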

So, comparing C and assembly:

static int x = 0xDEC0EFBE;
t0 = &x;
s0 = *t0;
*t0 = s0;

x: .word 0xDEC0EFBE
# ...
la t0, x
lw s0, 0(t0)
sw s0, 0(t0)

Or, you know, just use the shortcut and give lw and sw the variable's name directly.

static int x = 0xDEC0EFBE;
s0 = x;
x = s0;

x: .word 0xDEC0EFBE
# ...
lw s0, x
sw s0, x

To Increment a Variable

lw t0, x
add t0, t0, 1
sw t0, x

Smaller Values

What if a word is too big for me? Well, good thing there are half (2 bytes) and byte (1 byte).

.data
    x:     .word 4 # => 0x00000004
    small: .half 4 # => 0x0004
    tiny:  .byte 4 # => 0x04

Wait how do we load them!?!

lh t0, small
sh t0, small

lb t0, tiny
sb t0, tiny

Wow that was hard! Wait... What happens to those extra bits???

Well, sometimes we need to widen a number with fewer bits to more. Zero extension is easy: put 0s at the beginning.

$1001_2 \rightarrow \text{to 8 bits} \rightarrow 00001001_2$

Wait, signed numbers exist. Sign extension puts copies of the sign bit at the beginning.

$1001_2 \rightarrow \text{to 8 bits} \rightarrow 11111001_2$
$0010_2 \rightarrow \text{to 8 bits} \rightarrow 00000010_2$

lb does sign extension; lbu does zero extension. The same goes for lh and lhu. However, the CPU does not know whether the value in memory is signed or unsigned, so you have to use the right instruction.
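
Here is a small C sketch of what the two kinds of byte load do to the upper bits of the register. The function names are made up for this example; they only imitate the widening step of lb and lbu.

```c
#include <stdint.h>

/* lbu-like widening: the upper 24 bits of the result are all 0. */
uint32_t zero_extend_byte(uint8_t byte) {
    return (uint32_t)byte;
}

/* lb-like widening: the upper 24 bits are copies of the byte's sign bit. */
uint32_t sign_extend_byte(uint8_t byte) {
    uint32_t value = byte;          /* start with zeros in the upper bits */
    if (byte & 0x80) {              /* sign bit (bit 7) is set */
        value |= 0xFFFFFF00u;       /* fill the upper 24 bits with 1s */
    }
    return value;
}
```

For the byte 0xF9 (11111001 in binary), sign_extend_byte gives 0xFFFFFFF9 and zero_extend_byte gives 0x000000F9. For a byte whose sign bit is 0, such as 0x02, both give the same result.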

When going the other way, the upper part of the value is cut off, or truncated. The sign issue doesn't exist when storing because we are going from a larger number of bits to a smaller number. So there are only sb and sh, and no sbu or shu.
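
A minimal C sketch of that truncation, using hypothetical helper names that imitate what sb and sh keep from a 32-bit register:

```c
#include <stdint.h>

/* sb-like: keep only the lowest 8 bits of the register value. */
uint8_t truncate_to_byte(uint32_t reg) {
    return (uint8_t)(reg & 0xFFu);
}

/* sh-like: keep only the lowest 16 bits of the register value. */
uint16_t truncate_to_half(uint32_t reg) {
    return (uint16_t)(reg & 0xFFFFu);
}
```

Both 0xFFFFFFF9 (a sign-extended byte) and 0x000000F9 (a zero-extended byte) truncate to the same byte, 0xF9, which is why one store instruction per size is enough.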

Endianness

Let's say there's a word at address 4 made of 4 bytes. How is it represented?

This is the idea of endianness: the rule used to decide what order to put the bytes in memory.

Note

It has nothing to do with the value of the bytes, only the order you store them in memory.

Example

0  1  2  3
DE C0 EF BE

Little-endian means "you start with the LITTLE end": 0xBEEFC0DE

Big-endian means "you start with the BIG end": 0xDEC0EFBE

Does it matter? No, as long as you're consistent. However, most computers use little-endian.

Endianness does not affect:

the arrangement of the bits within a byte.
1-byte values, arrays of bytes, ASCII strings.
the ordering of bytes inside the CPU.

Endianness only affects moving or splitting values that are larger than one byte.
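
The example bytes above can be read both ways with a short C sketch. The function names are invented for this illustration; the shifts spell out which byte ends up as the "little end" (least significant) and which as the "big end" (most significant).

```c
#include <stdint.h>

/* Little-endian: the byte at the lowest address is the least significant. */
uint32_t read_word_little_endian(const uint8_t bytes[4]) {
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}

/* Big-endian: the byte at the lowest address is the most significant. */
uint32_t read_word_big_endian(const uint8_t bytes[4]) {
    return (uint32_t)bytes[0] << 24
         | (uint32_t)bytes[1] << 16
         | (uint32_t)bytes[2] << 8
         | (uint32_t)bytes[3];
}
```

Given the bytes DE C0 EF BE at addresses 0 through 3, the little-endian reader returns 0xBEEFC0DE and the big-endian reader returns 0xDEC0EFBE. The bytes in memory are identical in both cases; only the order in which they are combined changes.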
