..:: Sandy Bridge Microarchitecture Continued… ::..
Yet another improvement with Sandy Bridge is the addition of a ring interconnect linking the processor cores, graphics and other components to the L3 cache. With the ring interconnect, Intel is taking a page out of its server microarchitecture playbook. There are separate rings for data, request, acknowledge and snoop, and each is able to handle 32 bytes of data per clock cycle. As you can imagine, bandwidth scales up as either the core count or the cache size increases. The new ring interconnect also reduces latency to the L3 cache, which improves performance.
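To put the 32 bytes per clock figure in perspective, here is a rough back-of-the-envelope sketch of how peak ring bandwidth would scale with core count. It assumes that every core's ring stop can move 32 bytes on every clock and that the ring runs at an illustrative 3.0 GHz; both the clock speed and the helper name `ring_bandwidth_gbs` are our own assumptions for the example, not Intel figures.

```python
# Back-of-the-envelope estimate of peak ring bandwidth.
# Assumption: each ring stop moves 32 bytes per clock (figure from the article).
BYTES_PER_CLOCK_PER_STOP = 32


def ring_bandwidth_gbs(clock_ghz: float, stops: int) -> float:
    """Theoretical peak aggregate bandwidth in GB/s for `stops` ring stops.

    clock_ghz is a hypothetical ring clock; real parts vary by model and turbo state.
    """
    return BYTES_PER_CLOCK_PER_STOP * clock_ghz * stops


if __name__ == "__main__":
    # Compare a dual core and a quad core part at the same illustrative clock.
    for cores in (2, 4):
        print(f"{cores} cores @ 3.0 GHz: {ring_bandwidth_gbs(3.0, cores):.0f} GB/s peak")
```

These are theoretical peaks under the stated assumptions, so real world throughput will be lower, but the sketch shows why doubling the core count doubles the available bandwidth to the L3 cache.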
The L3 cache is partitioned into four 2MB banks. Each bank has its own arbitration and a full cache pipeline. For dual core and low end products, some of these banks are disabled to produce the required cache size. In the full quad core design, each processor core has access to the full cache, although each also has its own associated L3 cache bank. An important item to note is that the graphics core now also accesses system memory through the L3 cache. In prior designs, the graphics core had a direct connection to system memory, which hampered performance, not to mention the fact that the memory controller had been moved to the GPU die rather than the processor die.
And last, but not least, we have the System Agent. As you can see from the slide above, this is essentially all the functionality that once existed in the Northbridge: the PCI Express controller, memory controller, DMI, display engine, power control and more. You may be more familiar with the “Uncore” nomenclature Intel used for previous generation processors. The key difference here is that the L3 cache no longer belongs to the System Agent. The L3 cache now operates at the full core operating frequency, separately from these other entities.