Google challenges Nvidia with Trillium, its rival AI chip, while promising to release H200 Tensor Core GPUs within days


  • Trillium offers a 4x training boost and 3x inference improvement over TPU v5e
  • Doubled HBM capacity and ICI bandwidth to support large language models
  • Scalable up to 256 chips per pod, ideal for extensive AI tasks

Google Cloud has launched its latest TPU, Trillium, the sixth generation model in its custom AI chip family, designed to power advanced AI workloads.

First announced in May 2024, Trillium is designed to perform large-scale training, tuning, and inference with improved performance and cost-efficiency.

The release is part of Google Cloud’s AI Hypercomputing infrastructure, which integrates TPUs, GPUs and CPUs in addition to open software to meet the increasing demands of generative AI.

A3 Ultra VMs are arriving soon

Trillium promises significant improvements over its predecessor, TPU v5e, with a fourfold improvement in training performance and an up to threefold increase in inference throughput. Trillium delivers twice the HBM capacity and doubled Interchip Interconnect (ICI) bandwidth, making it particularly suitable for large language models such as Gemma 2 and Llama, as well as for compute-intensive inference applications, including diffusion models such as Stable Diffusion XL.

Google is also keen to highlight Trillium’s energy efficiency, with a claimed 67% increase over the previous generation.

Google says its new TPU has shown significantly improved performance in benchmark tests, delivering four times faster training speeds for models such as Gemma 2 (27B) and Llama 2 (70B). For inference tasks, Trillium achieved three times the throughput of TPU v5e, particularly excelling in models that require extensive compute resources.

According to Google, scaling is another strong point of Trillium. The TPU can connect up to 256 chips in a single high-bandwidth pod, expandable to thousands of chips within Google’s Jupiter data center network, providing near-linear scalability for extensive AI training tasks. With Multislice software, Trillium maintains consistent performance across hundreds of pods.
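As a rough illustration of what "near-linear" scalability means in practice, the sketch below models aggregate throughput as a job grows from one 256-chip pod to many pods connected via Multislice. All numbers here are hypothetical placeholders for illustration, not published Trillium figures.

```python
# Illustrative sketch only: models near-linear scaling across TPU pods.
# The efficiency factor and per-chip throughput are hypothetical, not
# published Trillium benchmarks.

def aggregate_throughput(chips: int,
                         per_chip_throughput: float = 1.0,
                         scaling_efficiency: float = 0.95) -> float:
    """Estimate total throughput across `chips` devices.

    scaling_efficiency models the small overhead that keeps real-world
    scaling "near-linear" rather than perfectly linear.
    """
    return chips * per_chip_throughput * scaling_efficiency

# A single pod connects up to 256 chips over the high-bandwidth ICI fabric.
single_pod = aggregate_throughput(256)

# Multislice lets one job span many pods across the data center network,
# e.g. 16 pods -> 4,096 chips.
sixteen_pods = aggregate_throughput(256 * 16)

print(f"1 pod:   {single_pod:.1f} chip-units")
print(f"16 pods: {sixteen_pods:.1f} chip-units")
```

Under this toy model, sixteen pods deliver roughly sixteen times the throughput of one, minus the fixed efficiency overhead, which is the behavior "near-linear scalability" describes.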

In connection with the arrival of Trillium, Google also announced the A3 Ultra VMs with Nvidia H200 Tensor Core GPUs. Scheduled for a preview this month, they will offer Google Cloud customers a powerful GPU option within the tech giant’s AI infrastructure.

(Video: “Trillium TPU, built to power the future of AI” – YouTube)
