nVidia introduced an update to its Tesla line of processors that doubles the performance over the prior release.
Tesla is derived from nVidia’s high-performance graphics cards and serves as a massive math co-processor to augment a standard x86 CPU.
Tesla systems use a PC’s x86 processor to boot the computer and load the operating system, but applications can use nVidia’s (NASDAQ: NVDA) CUDA language to send their mathematical functions to the Tesla chip rather than the x86 processor.
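As a rough illustration of the offload model described here — the kernel, names and sizes below are invented for this sketch, not taken from any TechniScan or nVidia code — a CUDA program hands an array computation to the Tesla card while the x86 host only stages the data:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: one lightweight GPU thread scales one array element.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;                 // 1M floats, purely illustrative
    const size_t bytes = n * sizeof(float);

    float *host = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    float *dev;
    cudaMalloc(&dev, bytes);                               // allocate on the Tesla card
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);  // ship data off the x86 host

    scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);         // math runs on the parallel cores
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);  // bring the results back

    printf("%f\n", host[0]);  // the CPU only orchestrated; the GPU did the arithmetic
    cudaFree(dev);
    free(host);
    return 0;
}
```

Compiling and running this requires nvcc and a CUDA-capable GPU; the pattern — allocate, copy, launch, copy back — is the same one a medical-imaging or financial application would use at much larger scale.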
With 120 massively parallel processing cores, Tesla handles data-intensive workloads such as medical imaging, financial modeling and scientific simulation. nVidia said this new generation of chips doubles the performance of the last, offering up to four teraflops in a rack system, doubles the memory to 16GB and delivers three times the power efficiency of the previous generation.
More importantly, it scales in a way Intel processors can’t, nVidia claims.
“The problem people have is if they want to double their performance, they have to double the number of servers,” Sumit Gupta, product manager of Tesla workstation products at nVidia, told InternetNews.com. “The other problem is CPUs have hit a megahertz wall, so they are going to multicore, but apps aren’t scaling with multicore.”
Jim Hardwick, senior software engineer at TechniScan, a maker of 3D ultrasound medical imaging systems, knows that all too well. In a clinical study of one imaging system, which used a cluster of six Pentium M machines, each image took an average of 2.5 hours to process, and as long as 4.5 hours.
This meant patients had to come back days or weeks later to get their results, which meant the inconvenience of a second visit and all the angst associated with waiting. The cluster had to run 24 hours a day to chew through the scans, creating heat and power problems.
Adding nodes to decrease the processing time helped to a point, but after a dozen nodes, the returns became negative.
“As we added nodes, time actually increased due to overhead,” Hardwick said. “Our application wasn’t able to run in parallel.”
So TechniScan added a Tesla add-in card to a PC, and processing time was cut to 45 minutes. In a system with four of the Tesla cards, the improvement was almost linear; processing time took as little as 15 minutes. This enabled doctors to perform scans and go over the results with the patient during the same visit.
“This all boils down to doing things in a single visit,” he said. “If we could get processing time down to 15-20 minutes, we could meet the requirement for a single visit. With Tesla, patients were able to get results the same day.”
Hardwick said he hopes to cut that time even further if nVidia’s promise of doubled performance holds true.
Tesla is derived from the same chip technology as nVidia’s graphics boards — except instead of graphics processing, Tesla does the kind of number crunching needed for imaging and simulations. It handles both integer and floating-point math equally well, Gupta said; the only difference between the product lines is how each puts the chip to use on its board.
Tesla is available as an add-in card for a standard desktop or tower PC, or as a 1U rack mount system. Unlike in its video card business, where it sells only chips to third-party OEMs, nVidia makes the Tesla hardware itself and sells it through resellers, including Sun and Lenovo.
While it has massive processing power, Tesla still needs an x86 chip to boot up and do things like scheduling and management.
“This is the model of heterogeneous computing,” Gupta said. “You need at least one CPU core to run the operating system.”
nVidia didn’t put much promotion behind CUDA when it first came out, but the company is making more of an effort now. It’s targeting life sciences, medical imaging, productivity, energy and finance as potential markets.
The company will offer the S1070 rack system, which features four Tesla processors in a 1U rack, for $7,995. It will also offer the C1060 plug-in card for a PC for $1,699. Both will be available in August.