RealTime IT News

Where You See a Gym, They See a Supercomputer

SAN FRANCISCO -- University of San Francisco computer scientists turned a gymnasium into a supercomputer on the fly over the weekend in an attempt to break into the ranks of the world's fastest machines.

Although the project fell short of its benchmarking goals, organizers said "FlashMob I" showed how the power of supercomputing could be made accessible to everyone. The effort was cobbled together from computers trundled from labs, classrooms, offices and homes Saturday and was the brainchild of John Witchel, a USF graduate student, and his professor, Greg Benson.

Project organizers Benson, Witchel and Pat Miller of USF hoped to link as many as 1,400 computers together using open source software to create an ad hoc supercomputer powerful enough to break into the world's Top 500.

"We needed 550 gigaflops to break into the Top 500," said USF associate professor David Galles, who hacked the standard Linpack benchmarking application to allow backing up the entire state of the machines to disk every half hour. By mid-afternoon, it became clear that the group wouldn't make the benchmark, but, he said, "we'll do some pretty impressive computation."

In the end, 669 computers were hooked into the network. The peak result was 180 gigaflops using 256 computers, but the best completed result was 77 gigaflops using only 150 computers.

"Linpack is an incredibly hard benchmark to run," said Miller, a USF lecturer. "The least hiccup and it quits." Many runs aborted for unknown reasons, and the tests were plagued with mysterious errors that he thinks may have been due to network card problems.

If 1,000 computers had come in the door, Miller believes the project would have achieved 500 gigaflops. The mass influx of community participants the group had planned for never materialized, and when the coordinators cleared the room in order to start tuning the system, nearly half the tables remained empty.
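As a rough check on these figures, the sketch below extrapolates from the two results reported above. It assumes perfectly linear scaling, which real Linpack runs over a network do not achieve, and the function names are illustrative only, not part of the FlashMob software.

```python
import math

# Figures reported in the article.
BEST_COMPLETED = (77.0, 150)   # gigaflops, machines
PEAK = (180.0, 256)            # gigaflops, machines
TOP500_THRESHOLD = 550.0       # gigaflops needed, per Galles


def per_machine_gflops(total_gflops, machines):
    """Average contribution of one machine in a given run."""
    return total_gflops / machines


def projected_gflops(machines, rate):
    """Projected total, ASSUMING every added machine contributes `rate`."""
    return machines * rate


def machines_needed(target_gflops, rate):
    """Smallest machine count that reaches the target under the same assumption."""
    return math.ceil(target_gflops / rate)


completed_rate = per_machine_gflops(*BEST_COMPLETED)  # about 0.51 gigaflops each
peak_rate = per_machine_gflops(*PEAK)                 # about 0.70 gigaflops each

print(f"1,000 machines at the completed-run rate: "
      f"{projected_gflops(1000, completed_rate):.0f} gigaflops")
print(f"Machines needed for the Top 500 at the completed-run rate: "
      f"{machines_needed(TOP500_THRESHOLD, completed_rate)}")
print(f"Machines needed for the Top 500 at the peak-run rate: "
      f"{machines_needed(TOP500_THRESHOLD, peak_rate)}")
```

At the per-machine rate of the best completed run, 1,000 machines would land near Miller's 500-gigaflop estimate, still short of the 550 needed for the Top 500.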

But organizers of the event said FlashMob I was a proof of concept for a new kind of supercomputing that brings massive computing power out of academic and military citadels and into the public domain. While traditional supercomputers are composed of identical nodes that are carefully tuned for maximum performance, the FlashMob software, based on the Linux kernel, is designed to quickly connect heterogeneous machines, enabling them to work together as one.

The idea is to let groups of like-minded individuals pull together immense computing power to work on a single problem, then disperse the machines back into their homes and offices.

Set-up for the maiden voyage began over the weekend with the installation of university-owned machines. From 8 AM to 11 on Saturday, people from the community who heard about the event through word of mouth and online bulletin boards like craigslist trooped to the campus to lend their personal computing power to the matrix.

There were laptops, desktops, Linux boxes and CPUs pulled from racks, their guts exposed. Online mortgage brokerage eLoans contributed 104 machines; entire USF labs were stripped of computing power for the one-day event.

While the physical set-up and breakdown for FlashMob I took three days, project co-organizer John Witchel said that FlashMob computing could be up and running in a matter of hours in locations that are already networked, such as office buildings and college campuses. "When I look at a dormitory," he said, "I see a supercomputer."

Mark Gantley, director of data management engineering for HP's high performance technical computing division, said the weekend project was intended to increase the amount of computing resources available.

A supercomputer he helped build for the University of Pittsburgh is already completely subscribed for the rest of its useful life. "Linux is very important to this community because there is an insatiable demand for computing cycles -- far more demand than there are supercomputers," he said. Therefore, the use of free software and a collage of different types of machines could bring the cost down and expand the use of supercomputing.

HP was a sponsor of the project, contributing technical resources and some cash for expenditures such as hiring an electrical contractor to pull extra juice into the gym.

The event was meticulously organized. Rows of folding tables filled the gym, thick power cables precisely coiled underneath. Some 1,400 individual gigabit Ethernet cables were carefully coiled, taped down to the tables and labeled with fluorescent numbered tags.

As a steady stream of people lugged their personal machines into the USF gym, they were met by one of more than 100 black t-shirted volunteers. The volunteer checked the numbered label on each computer against the number on a receipt, gave the owner a CD with the open-source FlashMob software, and handed the owner off to another volunteer, who ushered the newcomer to the next vacant spot on a table.

Individual machines were connected to four Foundry Networks switches, on loan from the company, which also provided staffers. The switches themselves were linked with fiber optic cables. A third corporate sponsor, Myricom, provided interconnect software and technical help.

Benson, Witchel and Miller were unperturbed about working with just half the CPUs they had planned for. The benchmark was a discrete and tangible goal, they said, but the real payoff was the launch of a sort of open source social experiment. They said FlashMob I will begin to teach others how to organize the physical set-up, how much electricity they'll need, and how to get people to show up with their computers.

Those who undertake the next FlashMob Supercomputer will build on this work, add their own code and pass it along with the wisdom they've gained. Already, other universities and groups around the world are requesting the software so they can build their own grassroots supercomputers.

As they were plugged in, batches of machines were tested to see if they met the minimum speed. If they didn't, a volunteer attempted to find a machine in the group that was dragging the speed down, because, often, a supercomputer runs only as fast as the slowest node. There was no time for fancy diagnostics or troubleshooting. "If there are fluctuations in one batch, we just won't use it," said co-organizer Benson, an assistant computer science professor at USF. "Today is about bringing stuff up, learning what we can to use in the future."
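The "slowest node" effect can be illustrated with a toy model. The sketch below assumes a lock-step job in which every machine gets an equal share of the work and all of them wait for the slowest to finish; that is a simplification chosen for illustration, not a description of how Linpack or the FlashMob software schedules work, and the function names and sample speeds are hypothetical.

```python
def aggregate_rate(node_gflops):
    """Rate of a lock-step job with equal work per node.

    Every node waits for the slowest one, so the group behaves like
    len(nodes) copies of its slowest member.
    """
    if not node_gflops:
        return 0.0
    return len(node_gflops) * min(node_gflops)


def best_subset(node_gflops):
    """Drop the slowest nodes whenever doing so raises the aggregate rate.

    After sorting, the best group is always "everything at or above some
    speed", so it is enough to try each possible cut-off.
    """
    nodes = sorted(node_gflops)
    if not nodes:
        return []
    cut = max(range(len(nodes)), key=lambda k: (len(nodes) - k) * nodes[k])
    return nodes[cut:]


# Hypothetical batch: three healthy machines and one straggler.
batch = [0.7, 0.7, 0.6, 0.1]
kept = best_subset(batch)

print(f"All four machines: {aggregate_rate(batch):.1f} gigaflops")
print(f"Without the straggler: {aggregate_rate(kept):.1f} gigaflops "
      f"from {len(kept)} machines")
```

Under this model, the batch of four runs slower than the three healthy machines alone, which is why pulling a single dragging machine, or skipping a whole unstable batch, can be the faster choice.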