Compared with the processor, main memory is slow to access. If the CPU needs a data item, a request is sent to main memory over the memory bus; main memory then locates the item and sends it back to the CPU. A lot of time is lost in this round trip. What if the data item were stored somewhere close to the CPU? A processor cache is based on exactly this idea. To explain how cache memory works, we will use a library analogy throughout this article.
Suppose we have a library with a single librarian. If a person comes and asks for Harry Potter Part I, the librarian walks to the bookshelf, retrieves the book, and hands it over. When the person is done with the book, it goes back on the shelf. If another person asks for the same book, the whole cycle repeats. This is exactly how a system works without a cache.
Why do we need processor cache?
Now let's see what happens when a cache is present. In our library example, think of a drawer at the librarian's desk as the cache. The procedure is the same when the first person requests a book. But when the book is returned, the librarian does not put it back on the shelf; she keeps it in her drawer. When the next person asks for the same book, the librarian simply retrieves it from the drawer. In the same way, cache memory stores the data items the processor uses frequently. Every time such data is requested, the processor simply looks in the cache and retrieves it, saving a long trip to main memory. This dramatically speeds up the processor.
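The drawer-versus-shelf idea can be sketched in a few lines of Python. This is a minimal illustration, not real hardware behaviour: the `slow_main_memory` dictionary and the addresses in it are invented for the example.

```python
# A hypothetical "slow" main memory, keyed by address (values invented).
slow_main_memory = {0x10: "Harry Potter I", 0x20: "Harry Potter II"}
cache = {}  # the librarian's drawer

def read(address):
    if address in cache:                   # cache hit: no trip to main memory
        return cache[address]
    value = slow_main_memory[address]      # cache miss: slow fetch from memory
    cache[address] = value                 # keep a copy in the drawer for next time
    return value

read(0x10)   # first request: a miss, fetched from main memory
read(0x10)   # second request: a hit, served straight from the cache
```

The first `read` pays the full cost of going to memory; every later `read` of the same address is answered from the dictionary, just as the librarian answers repeat requests from her drawer.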
Does cache memory store only the frequently used data items?
No. Cache memory also anticipates data that is likely to be requested in the near future. Continuing with our library example: when a person asks for Harry Potter Part I, our clever librarian fetches Harry Potter Part II along with it. When the person finishes the first book, it is quite likely he will ask for the second part, and when he does, the librarian has it ready in her drawer. Similarly, when the cache fetches a data item from main memory, it also fetches the items located at nearby addresses. Such an adjacently located chunk of data, transferred to the cache as a single unit, is called a cache line.
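The cache-line idea can be sketched as follows. This is an illustrative model only; the line size of 4 words and the toy memory are assumptions made for the example, not real hardware parameters.

```python
LINE_SIZE = 4                     # words per cache line (assumed for illustration)
main_memory = list(range(100))    # pretend each list index is an address
cache = {}

def read(address):
    if address not in cache:
        # A miss loads the whole line containing 'address', not just one word,
        # so the neighbouring addresses come along "for free".
        base = (address // LINE_SIZE) * LINE_SIZE
        for a in range(base, base + LINE_SIZE):
            cache[a] = main_memory[a]
    return cache[address]

read(5)   # miss: loads the line covering addresses 4..7
read(6)   # hit: address 6 arrived with the same cache line
```

Reading address 5 pulls in addresses 4 through 7, so the follow-up read of address 6 is a hit, just as Part II was already waiting in the drawer.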
Two-level processor cache
Most hard drives and other components use a single-level cache, but a processor cache typically has at least two levels. The level 1 (L1) cache is smaller and faster; the level 2 (L2) cache is slightly slower, yet still far faster than main memory. The L1 cache is divided into two parts: an instruction cache, which holds the instructions the CPU needs for computation, and a data cache, which holds the values required by the current execution. The L2 cache is responsible for loading data from main memory. Returning to our library example, consider the librarian's drawer as the L1 cache. On a busy day, when demand for books is high and the librarian has already stored many books in her drawer, it can fill up quickly. This is where the L2 cache comes in. Consider a bookcase near the librarian's desk as the L2 cache. When the drawer fills up, the librarian starts storing books in the bookcase. Whenever a popular book is requested, the librarian first looks in her drawer; if the book is not there, she searches the bookcase. Similarly, when the L1 cache is full, data is stored in the L2 cache. The processor looks for data in L1 first; only if it is not found there is L2 searched. If the data is not in L2 either, a trip to main memory is inevitable.
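The drawer-then-bookcase lookup order can be sketched like this. The tiny capacities and the eviction rule (push the most recently inserted L1 entry down to L2 when the drawer is full) are simplifying assumptions for the example, not how real hardware chooses victims.

```python
L1_CAPACITY = 2   # the drawer: tiny but fastest (size assumed for illustration)
L2_CAPACITY = 4   # the bookcase: bigger, a little slower

main_memory = {a: f"data@{a}" for a in range(16)}
l1, l2 = {}, {}

def read(address):
    if address in l1:                     # fastest: found in the drawer
        return l1[address]
    if address in l2:                     # slower: found in the bookcase
        value = l2[address]
    else:                                 # slowest: fetched from the shelves
        value = main_memory[address]
    if len(l1) >= L1_CAPACITY:            # drawer full: demote one entry to L2
        old_addr, old_val = l1.popitem()
        if len(l2) < L2_CAPACITY:
            l2[old_addr] = old_val
    l1[address] = value                   # newly used data always lands in L1
    return value

read(3)              # miss everywhere: fetched from main memory into L1
read(7); read(11)    # L1 fills up; one entry gets demoted to L2
```

After these reads, recently used items sit in L1, an overflow item sits in L2, and anything else still requires the full trip to main memory.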
Is implementing more cache a good idea?
Yes and no. More cache lets you fetch data quickly only when the data is actually present in L1 or L2. Back to our library example: if a person requests a popular book that is in neither the drawer nor the bookcase, the librarian still looks in the drawer first and then in the bookcase, so a lot of time is wasted before she finally retrieves it from the shelf. Similarly, the processor checks L1, then L2, and only when the item is found in neither does it send a request to main memory; the time spent searching both caches is wasted. When the processor finds the required data item in any of the cache memories, a 'cache hit' is said to occur; on other occasions, a 'cache miss' takes place. Cached data items are periodically updated and replaced using various algorithms to maximize the number of cache hits.
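One widely used replacement policy is least-recently-used (LRU): when the cache is full, evict the item that has gone unused the longest. The sketch below counts hits and misses to show the policy at work; the capacity, addresses, and data are all invented for the example.

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache with least-recently-used replacement, one common policy
    for maximizing cache hits (sizes and data here are invented)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # oldest entry first, newest last
        self.hits = self.misses = 0

    def read(self, address, main_memory):
        if address in self.entries:
            self.hits += 1
            self.entries.move_to_end(address)   # mark as recently used
            return self.entries[address]
        self.misses += 1
        value = main_memory[address]            # slow fetch on a miss
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        self.entries[address] = value
        return value

mem = {a: a * 10 for a in range(8)}
c = LRUCache(capacity=2)
for a in (0, 1, 0, 2, 0):   # address 0 stays "hot", so LRU keeps it cached
    c.read(a, mem)
```

Because address 0 is touched again and again, LRU evicts the colder addresses 1 and 2 instead, and the hot item keeps scoring hits.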
One might think: if cache memory is so fast, why not make it large enough to hold the entire contents of main memory? The reason is that although cache memory offers fast access, that speed comes at great expense. Hence, making proper use of the limited cache available is a must.