12. Extras
Microprogramming
Microprogramming is when the control is an FSM whose transition table is stored in a special memory. That memory contains the microcode: the small "programs" that encode the sequence of steps for each instruction.
The microcode memory is usually a ROM (read-only memory), but the real power comes when we make it possible to reprogram the microcode ROM. It can be made out of EEPROM or flash. The result is called firmware: software that serves a very important function and is hard to change.
Microcoded control is abstract and flexible, but at the cost of speed.
Non-microcoded (hardwired) control can be very fast, but it is harder to change.
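To make the "FSM whose transition table lives in memory" idea concrete, here is a minimal sketch in Python. The opcodes, control-signal names, and the table layout are all made up for illustration; a real microcode ROM is a block of bits, not a dictionary.

```python
# Hypothetical microcode ROM: (opcode, step) -> (control signals, next step).
# A next step of None means the instruction is finished.
# All opcode and signal names here are invented for this sketch.
MICROCODE_ROM = {
    ("LOAD", 0): (["mem_addr<-operand"], 1),
    ("LOAD", 1): (["mem_read"], 2),
    ("LOAD", 2): (["reg<-mem_data"], None),
    ("ADD", 0): (["alu_a<-reg1", "alu_b<-reg2"], 1),
    ("ADD", 1): (["alu_add", "reg<-alu_out"], None),
}


def run_instruction(opcode, rom=MICROCODE_ROM):
    """Step the control FSM through the micro-steps of one instruction.

    Returns the list of control-signal groups, one group per step.
    """
    step = 0
    trace = []
    while step is not None:
        signals, next_step = rom[(opcode, step)]
        trace.append(signals)
        step = next_step
    return trace


for signals in run_instruction("LOAD"):
    print(signals)
```

The control logic itself (run_instruction) never changes. Changing what an instruction does only means changing entries in the table, which is roughly what a firmware update to a reprogrammable microcode memory does.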
Caching
Memory is organized as a hierarchy. At the top, we have faster, smaller, more expensive memory (registers, L1 cache, L2 cache, DRAM); going down, memory gets cheaper, slower, and larger (local disk, distributed storage, tape).
Caching is necessary for the utility of computers. In order to actually use these fast CPUs, we need to improve the apparent speed of memory.
One way to think about it is reading at a desk: your desk is like the registers, your bookshelf is like the cache, the library is like RAM, and boxes in a storage unit are like the local disk.
When data is requested, the goal is to read a word into a CPU register. The CPU first asks the cache whether it has a copy. If it doesn't, the request goes to the next level down, such as RAM.
RAM copies the value to the cache, and the cache copies the value into the register.
When the CPU requests data from an empty cache, the data won't be available. The first access to any piece of data always misses (a compulsory miss). A cache miss is expensive, since the value has to be fetched from slower memory and written to the cache before it is loaded into a register. If we do get a cache hit, then it will be really fast! When the cache fills up, we have to remove something to make room, and we can choose what to remove based on a few things, like time of last use, age (oldest first), or size.
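The hit/miss behavior above can be sketched with a tiny simulated cache. This is only an illustration under some assumptions: the class name TinyCache is made up, a plain dictionary stands in for RAM, and the eviction rule chosen here is "least recently used", which is one of several possible policies.

```python
from collections import OrderedDict


class TinyCache:
    """A toy cache in front of a slower backing store (a dict standing in for RAM)."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing
        self.lines = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.lines:
            # Cache hit: fast path, just mark the entry as recently used.
            self.hits += 1
            self.lines.move_to_end(addr)
            return self.lines[addr]
        # Cache miss: fetch from the backing store, then keep a copy.
        self.misses += 1
        value = self.backing[addr]
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used entry
        self.lines[addr] = value
        return value


ram = {addr: addr * 10 for addr in range(8)}
cache = TinyCache(capacity=2, backing=ram)
for addr in [0, 0, 1, 2, 0]:
    cache.read(addr)
print("hits:", cache.hits, "misses:", cache.misses)
```

In this run the first read of each address is a compulsory miss, the second read of address 0 is a hit, and the final read of address 0 misses again because it was evicted to make room for address 2.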
Superscalar CPU
Is the best we can do 1 instruction per cycle? Well, no.
A scalar CPU can finish at most 1 instruction per cycle.
A superscalar CPU can finish more than 1 instruction per cycle. Remember getting more hardware? Well, do it!! With duplicated hardware, we can fetch and execute several instructions at the same time.
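As a rough back-of-the-envelope check, the best-case cycle count is the number of instructions divided by how many can finish per cycle, rounded up. The function name and the numbers below are made up for illustration, and the model ignores dependencies, cache misses, and pipeline fill time.

```python
import math


def ideal_cycles(num_instructions, issue_width):
    """Best-case cycles to finish, assuming every slot is filled every cycle."""
    return math.ceil(num_instructions / issue_width)


print(ideal_cycles(12, 1))  # scalar CPU
print(ideal_cycles(12, 4))  # hypothetical 4-wide superscalar CPU
```

Real programs rarely reach this bound, because instructions often have to wait for each other's results.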
Out-of-order CPUs
Instruction streams are inherently linear, but the computations they describe might not be! An out-of-order CPU tracks the dependencies between instructions and runs the instructions in whatever order is fastest, as long as each one still waits for the results it depends on.
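A minimal sketch of the idea, assuming every instruction takes exactly one cycle and that the dependencies are handed to us explicitly (a real CPU works them out in hardware from the register names). The schedule function and the five-instruction program are invented for this example.

```python
def schedule(program, issue_width):
    """Greedy out-of-order schedule.

    program: list of (name, set of names it depends on), in program order.
    Returns a list of cycles; each cycle is the list of names issued in it.
    """
    done = set()
    remaining = list(program)
    cycles = []
    while remaining:
        # An instruction is ready once everything it depends on has finished.
        ready = [name for name, deps in remaining if deps <= done][:issue_width]
        if not ready:
            raise ValueError("circular dependency in program")
        cycles.append(ready)
        done.update(ready)
        remaining = [(name, deps) for name, deps in remaining if name not in done]
    return cycles


# Hypothetical program: b needs a, d needs c, e needs both b and d.
program = [
    ("a", set()),
    ("b", {"a"}),
    ("c", set()),
    ("d", {"c"}),
    ("e", {"b", "d"}),
]

for cycle, names in enumerate(schedule(program, issue_width=2), start=1):
    print(cycle, names)
```

With two issue slots, the independent pairs (a, c) and (b, d) run side by side and the program finishes in 3 cycles, even though c comes after b in program order. With one slot it takes 5 cycles.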