IBM Reports Cache Memory Breakthrough

IBM this week announced a breakthrough in microprocessor design that promises to greatly improve processor performance, particularly for multicore processors.

The breakthrough relates to the memory used in the Level 1, or L1, cache of a microprocessor. The L1 cache is the first level of CPU cache, used to store the most recently used instructions.

L1 caches use static RAM, or SRAM. This type of memory does not need to be periodically refreshed, which is what makes it ideal for a CPU cache. It can run as fast as 10 nanoseconds.

However, it has one shortcoming: it requires up to six transistors per bit of information, whereas dynamic RAM (DRAM) requires only one transistor plus a capacitor per bit. As a result, L1 caches can take up a considerable amount of space on a CPU die.
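The transistor-count gap can be illustrated with some back-of-the-envelope arithmetic. The per-bit device counts are the only figures taken from the article; the 64KB cache size is simply a typical L1 figure, and real caches also need tag arrays, sense amplifiers, and control logic:

```python
# Back-of-the-envelope device counts for a 64KB cache, comparing a
# six-transistor SRAM cell with a one-transistor, one-capacitor DRAM
# cell (illustrative only).

CACHE_BYTES = 64 * 1024          # a typical L1 size
BITS = CACHE_BYTES * 8

SRAM_DEVICES_PER_BIT = 6         # six transistors per bit
DRAM_DEVICES_PER_BIT = 2         # one transistor plus one capacitor

sram_devices = BITS * SRAM_DEVICES_PER_BIT
dram_devices = BITS * DRAM_DEVICES_PER_BIT

print(f"SRAM devices: {sram_devices:,}")              # 3,145,728
print(f"DRAM devices: {dram_devices:,}")              # 1,048,576
print(f"Ratio: {sram_devices / dram_devices:.0f}:1")  # 3:1
```

Raw device count is only a rough proxy for die area, since transistors and capacitors occupy different footprints, which is one reason analysts' area-savings estimates vary.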

IBM’s breakthrough, announced at the International Solid-State Circuits Conference (ISSCC) this week in San Francisco, allows DRAM to be used in place of SRAM in CPU caches. The new memory is called eDRAM, or embedded dynamic random access memory.

By using a manufacturing process called silicon-on-insulator, IBM was able to create DRAM fast enough to be used in a cache. Access time would be only 1.5 nanoseconds; a nanosecond is a billionth of a second. The eDRAM will take up one-third the die space of conventional SRAM and consume one-fifth the power.

The technology will be part of IBM’s forthcoming 45nm microprocessor roadmap and should become available beginning in 2008.

L1 caches tend to be very small, only around 64KB to 128KB in most processors, while Level 2 and Level 3 caches, which can sit off the CPU die, run from 256KB to 4MB. That’s because of how much die space the SRAM consumes.

Dan Sokol, an analyst for the Envisioneering Group, said this could reduce the need for up to three levels of cache. “If you have a large enough L1 cache, everything else is irrelevant. You probably won’t need as much L2 and L3; it depends on the processor,” he said.

Sokol estimates the move to eDRAM will give IBM a space savings of almost 4 to 1. Rather than shrink the size of the CPU, IBM will fill that space with cache, which could mean a four-fold increase in the size of the L1 cache. Sokol said that will yield better performance across the board because on-die cache performs better than off-die cache.

“You want it on the same die with a 128- or 256-bit bus to get the bits back and forth as fast as possible,” he said. “There’s nothing worse than a CPU with not enough cache. If you had two identical CPUs but one had twice as much L1 cache, it would be the faster processor.”
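Sokol’s two-identical-CPUs comparison can be sketched with a toy simulation: a minimal, fully associative LRU cache model in Python. The trace, line size, and cache sizes here are invented for illustration and do not come from the article:

```python
from collections import OrderedDict
import random

def hit_rate(trace, cache_lines):
    """Simulate a fully associative LRU cache and return its hit rate."""
    cache = OrderedDict()
    hits = 0
    for addr in trace:
        line = addr // 64                  # 64-byte cache lines (assumed)
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark line most recently used
        else:
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict least recently used
            cache[line] = True
    return hits / len(trace)

# A synthetic trace of uniformly random accesses across a 32KB region.
random.seed(0)
trace = [random.randrange(0, 32 * 1024) for _ in range(50_000)]

small = hit_rate(trace, cache_lines=128)   # holds 8KB of 64-byte lines
big   = hit_rate(trace, cache_lines=512)   # holds 32KB of 64-byte lines
print(f"small cache hit rate: {small:.2f}")
print(f"big cache hit rate:   {big:.2f}")
```

With this synthetic trace, the larger cache eventually holds the whole 32KB working set and hits on nearly every access, while the smaller cache keeps evicting and missing, Sokol’s comparison in miniature.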

The real beneficiaries will be multicore processors, because the more cache a processor has, the more threads it can handle without having to unload them from the cache. It will also be important for maintaining the all-important cache coherence.

“Right now we’re shipping dual core chips and leaving cache coherency to software programmers. That’s what I consider ducking the issue. Cache coherency is a serious issue when you go beyond two processors. This will be a big help in addressing that,” said Sokol.

While Intel and its arch-rival AMD have not announced anything comparable, Sokol expects that to change. “There’s very little in this industry that one company has and can hold onto without its competitors coming up with their own work-around,” he said.

IBM is already a manufacturing partner of AMD’s.
