Broadcom's next-generation PCIe switches support AMD's socket-to-socket Infinity Fabric technology (also known as xGMI), AMD's standard for increasing data transfer speeds between CPUs in a system.
The Infinity Fabric interconnect, normally used in EPYC servers, handles socket-to-socket connectivity and behaves like PCIe Gen 5, as well as CXL, for attached devices. Now that Broadcom supports the standard, the technology will make its way into its PCIe switches. But the real secret weapon here is Ethernet, says ServeTheHome.
AMD has thrown its support behind the not-yet-formed Ultra Ethernet Consortium (UEC), which aims to use the 50-year-old connectivity technology as the main interconnect between AI clusters in place of InfiniBand, which has been used thus far. InfiniBand has long been the default in most high-performance computing (HPC) deployments, while Ethernet has seen broader mainstream adoption. Lately, however, Ethernet has emerged as a technology capable of handling high-speed data transfer in the era of data-intensive workloads and AI.
Nvidia's flagship chips, including the A100 and H100 accelerators, use the company's own NVSwitch to connect GPUs within a chassis, and then link chassis together via an external connection.
NVLink, a multi-lane, short-range link that rivals PCIe, lets a single device run multiple links simultaneously in a mesh-style network orchestrated through a central hub. By joining the UEC, AMD hopes to up its game, relying on cross-industry partnerships to close the gap in areas where Nvidia's NVLink currently has the advantage.
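To put that multi-link design into perspective, here is a minimal back-of-envelope sketch (not from the original report) comparing the aggregate bandwidth of an accelerator running many short-range links at once against a single PCIe Gen 5 x16 connection. The link count and per-link rate below are approximate, publicly cited peak figures assumed for illustration only.

```python
# Back-of-envelope comparison of per-accelerator interconnect bandwidth.
# All figures are approximate peak numbers, assumed for illustration.

NVLINK_LINKS_PER_GPU = 18     # assumed number of NVLink links on an H100-class GPU
NVLINK_GBPS_PER_LINK = 50     # ~50 GB/s bidirectional per link (approximate)
PCIE_GEN5_X16_GBPS = 128      # ~128 GB/s bidirectional for a Gen 5 x16 slot (approximate)

# Aggregate bandwidth comes from running all links simultaneously.
nvlink_total = NVLINK_LINKS_PER_GPU * NVLINK_GBPS_PER_LINK

print(f"Aggregate NVLink bandwidth per GPU: ~{nvlink_total} GB/s")
print(f"PCIe Gen 5 x16 bandwidth:           ~{PCIE_GEN5_X16_GBPS} GB/s")
print(f"Ratio:                              ~{nvlink_total / PCIE_GEN5_X16_GBPS:.0f}x")
```

Under these assumptions, the many-links-per-device approach yields several times the bandwidth of a single PCIe slot, which is the scaling advantage the UEC's Ethernet-based approach is meant to counter.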
AMD recently released its powerful Instinct MI300 accelerator, but in real-world deployments, scaling out these chips and enabling high-speed communication between them is just as important as raw performance. Nvidia's NVLink allows its systems to scale extremely well, a gap AMD will look to close with these latest developments.