Sandia National Laboratories’ new El Dorado supercomputer ranks 20th in the world on the latest Top500 list, released at the 2024 Supercomputing Conference in Atlanta. The machine is smaller than, but architecturally identical to, Lawrence Livermore National Laboratory’s El Capitan supercomputer, which ranked as the fastest in the world on the same list.
“We at Sandia have invested in preparing many of our engineering and science codes to run effectively on El Capitan and El Dorado,” said Andrew Younge, a Sandia supercomputing manager. “I am looking forward to taking advantage of this massive new capability with El Capitan and enabling a much higher level of fidelity in our simulations.”
According to Younge, the El Dorado/El Capitan system represents the first leadership-class exascale system designed to support the National Nuclear Security Administration’s stockpile stewardship missions. He described El Dorado’s specific role as that of an application-readiness test system.
“Basically, it is an extra-large on-ramp for Sandia computing codes to build, test, prepare, validate and update, all at Sandia, before running at exascale on El Capitan,” said Younge.
“Because El Dorado is actually quite large compared to normal application readiness systems, we anticipate it will provide production cycles to Sandia as well,” he continued. “As a limited side function, El Dorado is also likely to enable Sandia to do more experimental R&D on the high-performance computing system itself, perhaps exploring new workflows or similar avenues in the future.”
“Part of the magic (I call it the ‘special sauce’) of the system is the use of the Cray-developed proprietary high-speed network called Slingshot,” said Sandia computing manager Kevin Stroup. Another plus, he said, is that the compute nodes are direct liquid cooled: coolant is piped through the nodes, where fluid-filled heat sinks carry away the heat they generate. “Without this, it would be nearly impossible to operate the system and deal with the heat produced.”
The machine, designed and built by Hewlett Packard Enterprise, is an EX4000 model based on the company’s Cray product line. It consists of three cabinets of compute blades, with a total of 384 MI300A nodes. The MI300A is an accelerated processing unit from AMD, “sort of a CPU+GPU put together,” Stroup said.
The El Dorado supercomputer represents another major deliverable from NNSA’s Advanced Simulation and Computing program to the nuclear security enterprise.