I’m a little bit confused about the purpose of the L2 cache on modern processors, given the existence of the L1 and L3 caches. I understand the general idea behind caching, but I don’t understand why the caches are set up the way they are.
I began thinking about this because of http://www.tomshardware.com/reviews/athlon-l3-cache,2416-2.html. When you get to the bottom of the review, you will notice that most of the newest CPUs have an L1 and L2 cache for each core and then an L3 cache for all the cores on the processor.
It makes perfect sense to me to have one cache for each core and then another cache shared by all the cores. I assume that the L3 cache has much higher latency.
What I don’t understand is why there are two caches for each core. Is the L2 cache really that much slower than the L1 cache? It was my understanding that the CPU does some kind of sequential search of the cache (or possibly just PARTS of the cache; I’m not sure exactly) to see whether the data is there. How is it faster to search a small L1 cache and then a larger L2 cache than to just combine them into one cache? It seems like the latency of one cache would be less than the latency of going through two caches whose combined size is equal.
I assume that the L1 cache must be much faster than the L2 cache. Otherwise, Intel and AMD wouldn’t set it up this way. The only thing about the L1 cache that seems like it would be faster is that it’s smaller. It just seems like the extra speed from the relatively rare hits in the L1 cache would be balanced out by the double caching when there’s an L1 miss, whether there’s an L2 hit or not.
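To make the trade-off I’m asking about concrete, here is a rough sketch of the comparison I have in mind. Every number in it is made up for illustration (the 4-cycle L1, the 12-cycle L2, the 40-cycle penalty for going past L2, and the hit rates); none of them come from the review or from a real CPU. It also assumes that a combined cache would be as slow as the L2 because it is at least as large, which is exactly the part I’m unsure about. The function names are my own.

```python
def two_level_latency(l1_hit_rate, l1_latency, l2_hit_rate, l2_latency, miss_penalty):
    """Average cycles per access when L1 is checked first, then L2.

    l2_hit_rate is the fraction of L1 misses that hit in L2.
    miss_penalty is the extra cost of going past L2 (hypothetical value).
    """
    l1_miss_rate = 1.0 - l1_hit_rate
    l2_miss_rate = 1.0 - l2_hit_rate
    return l1_latency + l1_miss_rate * (l2_latency + l2_miss_rate * miss_penalty)


def single_level_latency(hit_rate, latency, miss_penalty):
    """Average cycles per access for one combined cache."""
    return latency + (1.0 - hit_rate) * miss_penalty


def combined_hit_rate(l1_hit_rate, l2_hit_rate):
    """Fraction of accesses that hit in either level of the two-level setup."""
    return 1.0 - (1.0 - l1_hit_rate) * (1.0 - l2_hit_rate)


if __name__ == "__main__":
    # All values below are assumptions chosen only to illustrate the question.
    L1_LATENCY = 4       # cycles, assumed
    L2_LATENCY = 12      # cycles, assumed; also used for the combined cache
    MISS_PENALTY = 40    # cycles beyond L2, assumed
    L2_HIT_RATE = 0.5    # fraction of L1 misses caught by L2, assumed

    for l1_hit_rate in (0.3, 0.6, 0.9):
        split = two_level_latency(
            l1_hit_rate, L1_LATENCY, L2_HIT_RATE, L2_LATENCY, MISS_PENALTY
        )
        merged = single_level_latency(
            combined_hit_rate(l1_hit_rate, L2_HIT_RATE), L2_LATENCY, MISS_PENALTY
        )
        print(f"L1 hit rate {l1_hit_rate:.1f}: "
              f"L1+L2 = {split:.1f} cycles, combined = {merged:.1f} cycles")
```

With these invented numbers the outcome depends almost entirely on how often the L1 cache hits and on how slow the combined cache would really be, which is why I’m asking what those values look like in practice.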
This doesn’t affect any programming I’m doing. I just wish I could understand.