‘No One Knows Yet’: Donut Design Could Create a Computing Monster With a Quadrillion Transistors – Analysts Discuss Unusual Interconnection as Cerebras CEO Admits We Don’t Know What Happens When Multiple WSEs Are Connected Together

Tri-Labs (comprising three major US research institutions: Lawrence Livermore National Laboratory (LLNL), Sandia National Laboratories (SNL) and Los Alamos National Laboratory (LANL)) has teamed up with AI company Cerebras on a number of scientific problems, including breaking the timescale barrier in the field of molecular dynamics (MD).

There is an article explaining this specific challenge that you can read here, but essentially it refers to the problem of running molecular dynamics simulations over a longer timescale than would normally be possible.

The barriers here are twofold: computing power and communication latency between different nodes of an HPC system. Traditionally, to compensate for limited computing power, scientists assign more work to each node and scale up the simulation size with the number of nodes, which grows the number of atoms simulated rather than the stretch of time covered. Unfortunately, slow, high-latency communication between nodes further exacerbates the timescale problem.
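To see why adding nodes does not automatically buy a longer simulated timescale, it helps to model one MD timestep as a per-node compute cost plus a fixed latency cost for exchanging boundary atoms with neighboring nodes. The sketch below is a back-of-envelope illustration only; all constants are invented for the example, not measurements of Frontier, Quartz or any other machine. It shows timesteps per second flattening out at a latency floor as more nodes are thrown at the same problem.

```python
# Illustrative only: a toy model of why more HPC nodes do not, by themselves,
# let an MD simulation cover more simulated time. Every constant here is an
# assumption chosen for illustration, not a measured value.

PER_ATOM_COMPUTE_S = 1e-9   # assumed compute cost per atom per timestep on one node
LATENCY_S = 5e-6            # assumed network latency per halo exchange
EXCHANGES_PER_STEP = 2      # assumed neighbor exchanges needed each timestep

def wall_time_per_step(total_atoms: int, nodes: int) -> float:
    """Rough wall-clock cost of one MD timestep: local compute plus fixed latency."""
    atoms_per_node = total_atoms / nodes
    compute = atoms_per_node * PER_ATOM_COMPUTE_S
    communication = EXCHANGES_PER_STEP * LATENCY_S
    return compute + communication

# Strong scaling: same problem, more nodes -> throughput stalls at the latency floor.
for nodes in (1, 16, 256, 4096):
    steps_per_second = 1.0 / wall_time_per_step(800_000, nodes)
    print(f"{nodes:>5} nodes: {steps_per_second:,.0f} timesteps/s")
```

In this toy model the jump from 256 to 4,096 nodes barely improves timesteps per second, because the fixed communication latency dominates once the per-node compute shrinks; that latency floor is precisely what caps the simulated timescale.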

Like a donut

MD simulations are crucial for several scientific fields because they bridge the gap between quantum electronic methods and continuum mechanical methods. However, these simulations face timescale limitations because they must resolve atomic vibrations, which occur over very short timescales, while also capturing phenomena that unfold over much longer periods.
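A quick calculation makes the gap concrete. MD typically uses timesteps of roughly a femtosecond to resolve atomic vibrations; the sustained throughput figure below is an assumption for illustration, not a benchmark of any machine mentioned in this article.

```python
# A rough sense of the timescale gap. The ~1 fs timestep is a typical MD choice
# needed to resolve atomic vibrations; the steps-per-second rate is an assumed,
# illustrative throughput.

TIMESTEP_S = 1e-15          # ~1 femtosecond per MD timestep
STEPS_PER_SECOND = 100_000  # assumed sustained throughput

for target_s, label in [(1e-9, "1 nanosecond"), (1e-6, "1 microsecond"), (1e-3, "1 millisecond")]:
    steps_needed = target_s / TIMESTEP_S
    wall_clock_days = steps_needed / STEPS_PER_SECOND / 86_400
    print(f"{label:>13}: {steps_needed:.0e} steps, ~{wall_clock_days:,.2f} days of wall clock")
```

At those assumed rates, a nanosecond of simulated time is trivial, a microsecond takes hours, and a millisecond takes months, which is why pushing beyond nanoseconds requires either far more timesteps per second or a fundamentally different machine.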

The authors of the paper attempted to overcome the timescale barrier by using a more efficient computing system, specifically Cerebras’ Wafer-Scale Engine.

As The Next Platform explains: “The specific simulation consisted of beaming radiation into three different crystal lattices made of tungsten, copper and tantalum. In these specific simulations, using 801,792 atoms in each lattice, the idea is to bombard the lattices with radiation and see what happens.”

By running the simulations on Frontier, the world’s fastest supercomputer, based at Oak Ridge National Laboratory in Tennessee, and on Quartz at LLNL, scientists could watch what happened to the lattices under radiation bombardment for just nanoseconds. With the WSE, they could watch for tens of milliseconds.

For the tests, Tri-Labs used the Cerebras Wafer Scale Engine 2 (WSE-2) rather than the newer and more powerful WSE-3 launched earlier this year, but as described above, the results were still impressive. As the paper states: “By allocating a processor core for each simulated atom, we demonstrate a 179-fold improvement in timesteps per second compared to the Frontier GPU-based exascale platform, along with a large improvement in timesteps per unit energy. Reducing every year of runtime to two days unlocks currently inaccessible timescales of slow microstructure transformation processes that are critical for understanding material behavior and function.”
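The two headline numbers are consistent with each other: a 179-fold speedup does compress a year of runtime into roughly two days, as a quick check using only the figures quoted above shows.

```python
# Sanity check of the quoted claim: a 179x improvement in timesteps per second
# turns a year of runtime into roughly two days.
speedup = 179
days_per_year = 365.25
print(f"{days_per_year / speedup:.2f} days")  # ~2.04 days
```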

The Next Platform’s Timothy Prickett Morgan asked Andrew Feldman, CEO and co-founder of Cerebras, what happens when you connect multiple wafer-scale engines together and try to run the same simulation, and was told that “nobody knows yet.”

Prickett Morgan further noted, “The proprietary interconnect in the WSE-2 systems could be scaled to 192 devices, and with the WSE-3 that number was increased by more than an order of magnitude to 2,048 devices,” but added that he “suspects that the same scaling principles apply to WSEs as to GPUs and CPUs.”

However, he suggested that there might be a way to physically tie WSEs together, creating a “stovepipe of squares of interconnected WSEs,” potentially forming a donut design with power on the inside and cooling on the outside. As Prickett Morgan concludes, “This kind of configuration can’t be worse than using InfiniBand or Ethernet to connect CPUs or GPUs.”
