ALBUQUERQUE, N.M. — Supercomputers have sprung up across the world landscape like the statues on Easter Island — separate, huge, and impenetrable to the average person. They perform hundreds of trillions of calculations per second, a figure almost ungraspable by a species that may have entered mathematics by first counting on its fingers.
But new work on Sandia National Laboratories’ Red Storm supercomputer — the 17th fastest in the world — is helping to make supercomputers more accessible, in effect removing them from the solitary confinement of their specialized operating systems.
Sandia researchers, working hand in hand with researchers from Northwestern University and the University of New Mexico, socialized 4,096 of Red Storm’s total 12,960 computer nodes into accepting a virtual external operating system — a leap of at least two orders of magnitude over previous such efforts.
“The goal is to create a more flexible environment for all users,” said Sandia researcher Kevin Pedretti, who led Sandia researchers in adapting and optimizing a Northwestern program called Palacios for the Red Storm environment. Sandia researchers directed the testing effort.
Built by Sandia as part of the National Nuclear Security Administration’s (NNSA) program to ensure the safety, security and effectiveness of the nation’s nuclear stockpile without testing, Red Storm’s advanced computational capabilities are also being utilized in unclassified modes to contribute to global efforts to combat climate change, evaluate dangers from possible asteroid strikes, and help solve other problems of national interest.
Peter Dinda, professor of electrical engineering and computer science at Northwestern’s McCormick School of Engineering, added, “If we can virtualize supercomputers without performance compromises we will make them easier to use and easier to manage, generally increasing the utility of these very large national infrastructure investments.” Dinda led the development of Palacios with his student Jack Lange.
Because of the complex nature of the classified work performed on Red Storm in the service of stockpile stewardship, its operating system is functionally restrictive compared with a general-purpose operating system.
Enter the technique called virtualization. A virtual machine in effect separates the hardware of a computer from its operating system.
“Our observation is that no single operating system will satisfy the needs of all potential users,” said Pedretti, “so we are attempting to leverage the virtualization hardware in modern processors to allow users to select the operating system best for them to use at run-time.”
This could permit one machine to simultaneously run multiple operating systems, with the possibility of migrating these systems from one computer to another. To achieve this trick on Red Storm, a receptor operating system called Kitten has been developed primarily at Sandia, while a virtual machine monitoring program called Palacios was developed at Northwestern. Operating through the filter of this programming translation, a program not native to Red Storm can run on nodes of the machine.
The overlaid program was only 5 percent less effective than running Red Storm’s native, fixed programming. That figure, called overhead, represents the additional expense in time and efficiency of running the program in a virtualized environment.
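For readers who want to see how an overhead figure of this kind is typically computed, here is a minimal sketch. It assumes overhead is measured as extra run time relative to the native run, which the article does not spell out. The function name and the timings are hypothetical and chosen only for illustration; they are not Red Storm measurements.

```python
def overhead_percent(native_seconds: float, virtualized_seconds: float) -> float:
    """Extra run time of the virtualized run, as a percentage of the native run time."""
    if native_seconds <= 0:
        raise ValueError("native_seconds must be positive")
    return 100.0 * (virtualized_seconds - native_seconds) / native_seconds


if __name__ == "__main__":
    # Hypothetical timings for illustration only (assumption, not measured data).
    native = 100.0       # seconds for the run on the native operating system
    virtualized = 105.0  # seconds for the same run inside the virtual machine
    print(f"overhead: {overhead_percent(native, virtualized):.1f}%")
```

Under this definition, a run that takes 105 seconds virtualized against 100 seconds natively carries a 5 percent overhead.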
“We believe the results show that the benefits of virtualization can be brought to even the largest computers in the world without performance compromises,” said Pedretti.
This would mean that researchers around the world should one day be able to run their own simulations on huge machines at remote sites without having to reconfigure their software to the machine’s specific hardware and software environment.
“Virtualization technology provides a path for supporting a broader range of supercomputer applications, both for traditional scientific computing and for national security purposes,” said Pedretti.
Industry magazines report that the virtualization market as a whole is worth billions of dollars.
The work was funded for Sandia by its Laboratory Directed Research and Development program. Northwestern and UNM work was funded by the National Science Foundation.