The San Diego Supercomputer Center (SDSC) at the University of California, San Diego plans to be the first site to run a supercomputer that relies on flash memory-based storage, rather than hard drive-based storage systems, to do the heavy lifting.
The school has been awarded a five-year, $20 million grant from the National Science Foundation (NSF) to build the computer, which it named “Gordon.” A flash-based computer named Gordon… Someone at UCSD has a sense of humor.
The computer, provided by high-performance computing vendor Appro, will be built from 32 “supernodes.” Each supernode consists of 32 compute nodes, each rated at 240 gigaflops of peak performance with 64 GB of DRAM, plus two I/O nodes with 4 TB of flash memory apiece. When tied together by virtual shared memory, each supernode can deliver up to 7.7 teraflops (TF) of compute power and 10 TB of memory (2 TB of DRAM and 8 TB of flash memory).
When fully configured and deployed, Gordon will feature 245 teraflops of total compute power, 64 TB of DRAM, 256 TB of flash memory and four petabytes of disk storage. Gordon’s 32 supernodes will be interconnected via an InfiniBand network capable of 16 gigabits per second of bi-directional bandwidth. The clustering software will be ScaleMP’s vSMP virtual shared-memory software.
Gordon is designed so that each supernode offers more than one terabyte of addressable memory per teraflop of peak performance, a memory-to-flops ratio greater than 1:1. For many HPC systems, the school claims, that ratio is less than 1:10.
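As a quick sanity check of the figures above, here is a short back-of-envelope calculation in Python using only the per-node numbers quoted in the article; small differences from the published totals come from rounding.

    # Back-of-envelope check using only the figures quoted above.
    compute_nodes = 32          # compute nodes per supernode
    gflops_per_node = 240       # peak gigaflops per compute node
    dram_per_node_gb = 64       # DRAM per compute node (GB)
    io_nodes = 2                # I/O nodes per supernode
    flash_per_io_node_tb = 4    # flash per I/O node (TB)
    supernodes = 32             # supernodes in the full system

    supernode_tflops = compute_nodes * gflops_per_node / 1000    # ~7.7 TF
    supernode_dram_tb = compute_nodes * dram_per_node_gb / 1024  # 2 TB
    supernode_flash_tb = io_nodes * flash_per_io_node_tb         # 8 TB
    supernode_mem_tb = supernode_dram_tb + supernode_flash_tb    # 10 TB

    system_tflops = supernodes * supernode_tflops                # ~245 TF
    system_dram_tb = supernodes * supernode_dram_tb              # 64 TB
    system_flash_tb = supernodes * supernode_flash_tb            # 256 TB

    # Memory-to-peak-flops ratio per supernode: about 1.3 TB per teraflop, i.e. > 1:1.
    print(supernode_mem_tb / supernode_tflops)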
Waiting for Sandy Bridge
Gordon won’t go into operation until late 2011 or early 2012 because UCSD is waiting on future technologies. The center wants Intel’s (NASDAQ: INTC) Sandy Bridge processor, due in late 2010 or early 2011. “The 8 flops per clock gets us to that 200 teraflops in the most economical way possible,” explained Mike Norman, interim director of SDSC.
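To see how the “8 flops per clock” figure feeds into a node’s rating, peak performance is simply cores × clock × flops per clock; the core count and clock speed below are hypothetical illustrations, not figures from SDSC or Intel.

    # Rough illustration only: cores_per_node and clock_ghz are hypothetical,
    # chosen to show how 8 flops per clock reaches the quoted 240 GF per node.
    flops_per_clock = 8      # per core, the Sandy Bridge figure Norman cites
    cores_per_node = 12      # hypothetical
    clock_ghz = 2.5          # hypothetical

    node_peak_gflops = cores_per_node * clock_ghz * flops_per_clock
    print(node_peak_gflops)  # 240.0, matching the per-node figure above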
The center also wants new flash-drive controllers that are considerably faster than the current generation. “We just couldn’t get the system with all the [performance] numbers we wanted any earlier,” said Norman. “We’re using one-quarter petabyte of flash. The high aggregate I/O rates for all the flash needs a new controller technology that won’t be there until 2010.”
Norman said that discussions with Intel gave the center confidence that the SSDs can last. “They are trying to move into the enterprise server space, and they are developing the technology they say will have the reliability. So they are one of our partners, and we’re going to rely on them to deliver,” he said.
One thing that will help preserve the drives is the usage model: loading massive databases and doing lots of reads. It’s writes, not reads, that wear out an SSD, and Norman estimates a 10-to-1 read/write ratio.
“If you’re reading continuously, it seems like we won’t wear them out,” he said. “I think part of it is these HPC machines have a three- to four-year lifespan, so we think we just won’t wear them out before the end of life of the system.” Intel’s wear-leveling technology is estimated to give the drives a seven-year lifespan.
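A rough way to see why a read-heavy workload stretches flash life: usable lifetime scales with total write endurance divided by the sustained write rate. The endurance and write-rate numbers in the sketch below are hypothetical placeholders, not figures from Intel or SDSC, and the calculation ignores write amplification.

    # Crude lifetime sketch; only the 8 TB of flash per supernode comes from
    # the article. Endurance and write rate are hypothetical placeholders.
    flash_capacity_tb = 8             # flash per supernode (from the article)
    pe_cycles = 10_000                # hypothetical program/erase cycles per cell
    writes_tb_per_day = 20            # hypothetical sustained writes per supernode

    total_writable_tb = flash_capacity_tb * pe_cycles
    lifetime_years = total_writable_tb / (writes_tb_per_day * 365)
    print(round(lifetime_years, 1))   # ~11 years at this assumed write rate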
Norman estimates the SSDs will deliver ten times the performance of high-speed hard drives and will open up new opportunities for experiments at the center.
“Where we see real opportunities is fusing databases together. We might have a genomics database in one supernode and a protein structure database in another and maybe epidemiology in a third. Then the user would cross-correlate them,” he said.
The computer is one of many to be discussed at the upcoming SC09 show in Portland, Oregon, running from November 14 to 20.