Scientists can get huge discounts on renting Nvidia’s A100 GPUs for AI training – but that won’t last long

Users of the National Energy Research Scientific Computing Center (NERSC) can run AI tasks on the organization’s Perlmutter supercomputer for half price this month.

Amid a lack of global availability of computing power for AI workloads, the facility – which operates on behalf of the U.S. Department of Energy’s Office of Science – is changing the equation.

Between September 7 and October 1, those registered with the organization pay half the normal rate. For example, a three-hour job that normally runs on seven nodes would be charged for 21 GPU node hours, but in September it would be charged for 10.5 GPU node hours.
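The arithmetic in that example can be sketched in a few lines of Python. The function name `charged_node_hours` and its `discount` parameter are hypothetical, chosen for illustration only. The sketch assumes the charge is simply nodes multiplied by hours multiplied by the undiscounted fraction, and ignores any other charge factors the facility may apply.

```python
def charged_node_hours(nodes: int, hours: float, discount: float = 0.5) -> float:
    """Return the GPU node hours charged after a fractional discount.

    Hypothetical helper: assumes charge = nodes * hours * (1 - discount).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return nodes * hours * (1.0 - discount)


# The article's example: a three-hour job on seven nodes.
normal = charged_node_hours(7, 3, discount=0.0)  # no discount applied
promo = charged_node_hours(7, 3)                 # 50% promotional discount
print(normal, promo)  # prints: 21.0 10.5
```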

Perlmutter’s A100 GPUs

“Using your time benefits the entire NERSC community and spreads demand more evenly throughout the year. To encourage usage now, starting tomorrow and through the end of September, we’re offering a 50% discount on all tasks running on Perlmutter GPU nodes,” wrote Rebecca Hartman-Baker, leader of NERSC’s User Engagement Group.

Hartman-Baker also pointed to additional assistance that NERSC will provide to users. This can be useful for those who are seeing poor performance and need help making sure their scripts are up to date, or for those who want to try out code but aren’t sure where to start, among other uses.

Launched in 2021, Perlmutter is an HPE Cray EX supercomputer that uses both AMD Zen 3 Epyc CPUs and Nvidia A100 Tensor Core GPUs. In the first phase of development, the machine was equipped with 1,536 GPU-accelerated nodes, each with four A100 GPUs, supplemented with 35 PB of all-flash Lustre-based storage. In the second phase, the supercomputer was expanded with 3,072 CPU-only nodes, each with two AMD Epyc processors and 512 GB of memory.

The supercomputer itself is largely used for nuclear fusion simulations, climate projections and materials and biological research. The first workloads run on Perlmutter included a project to discover how atomic interactions worked – which could lead to better batteries and biofuels.

GPU capacity to run AI workloads is difficult to come by, and the offer unfortunately only applies to members of NERSC. It was originally spotted by Microsoft high-performance computing (HPC) specialist Glenn Lockwood, who pointed out that NERSC “could make a killing” by supplementing idle capacity with commercial workloads.

This would be especially true during the summer months when academics are largely absent. However, there are alternative ways to rent GPUs, including through Akash’s decentralized Supercloud for AI network.
