DRAM performance has improved far more slowly than CPU performance, so the performance gap between DRAM and the CPU keeps widening. Chances are your CPU has already finished its work and is idling, waiting for data from DRAM before it can continue; here DRAM speed is bottlenecking CPU speed. To solve this, we put smaller but faster (and more expensive) memory on the CPU. Chunks of memory (DRAM) are loaded into it so that the CPU can access them immediately. This memory is known as cache memory.
The memory hierarchy looks something like this:

                   |   DRAM
Size and           |   L3 (shared between cores)
latency decrease   |   L2
                   v   L1 (Data), L1 (Instruction)
The hierarchy is shown as a formality; we don't have to keep the levels of cache in mind when programming.
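Even so, the cache's effect on performance is easy to observe. Here is a minimal sketch (the 64 MiB size and 4096-byte stride are assumptions, not tuned to any particular CPU; compile with low optimization, e.g. -O0, so the loops aren't elided) that performs the same number of memory accesses twice, once sequentially and once with a large stride:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N (64 * 1024 * 1024) /* 64 MiB, bigger than any cache level */

int main(void) {
    char *buf = malloc(N);
    if (!buf) return 1;
    memset(buf, 0, N); /* warm-up: fault the pages in before timing */

    /* Sequential pass: consecutive bytes share a cache block,
       so most accesses are hits. */
    clock_t t0 = clock();
    for (size_t i = 0; i < N; i++)
        buf[i]++;
    clock_t t1 = clock();

    /* Strided pass: same number of increments, but each access
       jumps 4096 bytes, landing in a different cache block. */
    clock_t t2 = clock();
    for (size_t s = 0; s < 4096; s++)
        for (size_t i = s; i < N; i += 4096)
            buf[i]++;
    clock_t t3 = clock();

    printf("sequential: %.2fs  strided: %.2fs\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t3 - t2) / CLOCKS_PER_SEC);
    free(buf);
    return 0;
}

Both loops increment every byte exactly once, but the strided pass should run noticeably slower: sequential accesses reuse the cache block that was just loaded, while each strided access lands in a different block.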
The associativity of a cache describes how copies of main memory are mapped into the cache. You can think of the cache as an N-byte 3D array, though in hardware the 2D arrays are laid out linearly one after another rather than stacked. This is what it looks like:
{                                     // 3rd dimension: sets
    {                                 // 2nd dimension: blocks (ways) in a set
        { a, b, c, d, e, f, g, h },  // 1st dimension: bytes in a block
        { i, j, k, l, m, n, o, p }
    },
    {
        { q, r, s, t, u, v, w, x },
        { y, z, !, @, #, $, %, ^ }
    }
}
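To make the 3D-array picture concrete: hardware splits an address into a block offset (which byte inside a block), a set index (which set it maps to), and a tag (which memory block is currently cached). A minimal sketch using the toy geometry above (2 sets, 2 blocks per set, 8-byte blocks; the variable names are mine, not a standard API):

#include <stdint.h>
#include <stdio.h>

/* Toy geometry matching the 3D-array picture above. */
#define BLOCK_SIZE 8  /* bytes per block -> 3 offset bits */
#define NUM_SETS   2  /* sets            -> 1 index bit   */

int main(void) {
    uint64_t addr = 0x1234;

    uint64_t offset = addr % BLOCK_SIZE;               /* byte within the block */
    uint64_t set    = (addr / BLOCK_SIZE) % NUM_SETS;  /* which set it maps to  */
    uint64_t tag    = addr / (BLOCK_SIZE * NUM_SETS);  /* identifies the block  */

    printf("addr=0x%llx -> tag=0x%llx set=%llu offset=%llu\n",
           (unsigned long long)addr, (unsigned long long)tag,
           (unsigned long long)set, (unsigned long long)offset);
    return 0;
}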
In a fully associative cache, a chunk of memory the size of a cache block can be placed anywhere in the cache (in other words, there is effectively a single set). For example, if main memory looks like:
{ a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x }
and you have a fully associative cache with 4 cache blocks of 4 bytes each, then your cache contents might look like:
{
    a, b, c, d,
    i, j, k, l,
     ,  ,  ,  ,   // Empty cache block
    u, v, w, x
}
That is, a cache block can be placed in any free slot, and when the cache is full one block is evicted to make room (not randomly: there are various eviction and replacement policies, such as LRU or FIFO, that choose the victim). A minimal sketch of this follows.
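Here is a sketch of a fully associative cache with the 4-block, 4-byte geometry from the example above, using round-robin (FIFO-style) replacement; real hardware more often uses LRU or pseudo-LRU, and all names here are hypothetical:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_BLOCKS 4
#define BLOCK_SIZE 4

typedef struct {
    bool     valid;
    uint64_t tag;    /* addr / BLOCK_SIZE: identifies the cached memory block */
} CacheBlock;

static CacheBlock cache[NUM_BLOCKS];
static int next_victim = 0; /* round-robin (FIFO-style) replacement */

/* Returns true on hit; on miss, "loads" the block by evicting a victim. */
bool access_addr(uint64_t addr) {
    uint64_t tag = addr / BLOCK_SIZE;

    /* Fully associative: the block may sit in ANY slot, so search them all. */
    for (int i = 0; i < NUM_BLOCKS; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return true;

    /* Miss: replace the next victim in round-robin order. */
    cache[next_victim].valid = true;
    cache[next_victim].tag   = tag;
    next_victim = (next_victim + 1) % NUM_BLOCKS;
    return false;
}

int main(void) {
    uint64_t addrs[] = { 0, 8, 20, 0, 32, 8 };
    for (int i = 0; i < 6; i++)
        printf("addr %2llu -> %s\n", (unsigned long long)addrs[i],
               access_addr(addrs[i]) ? "hit" : "miss");
    return 0;
}

The defining property of full associativity shows up in the lookup: every slot must be searched on every access, which is one reason real fully associative structures (like TLBs) are kept small.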
TBD
TBD
Agenda: