Cloudflare has announced the deployment of its 12th generation servers, powered by AMD EPYC 9684X Genoa-X processors, delivering improved performance and efficiency across its infrastructure.
The new processor has 96 cores, 192 threads, and a massive 1,152 MB of L3 cache – three times as much as AMD's standard Genoa processors offer.
This substantial cache boost helps reduce latency and improve performance in data-intensive applications; Cloudflare says Genoa-X delivers a 22.5% performance improvement over other AMD EPYC models.
According to the company, the new Gen 12 servers can handle up to 145% more requests per second (RPS) and are 63% more energy efficient than the previous Gen 11 models. An updated thermal-mechanical design and expanded GPU support also enhance their capabilities for AI and machine learning workloads.
The new servers are equipped with 384 GB of DDR5-4800 memory across 12 channels, 16 TB of NVMe storage, and dual 25 GbE network connectivity. This configuration gives Cloudflare higher memory throughput and faster storage access, optimizing performance for a range of compute-intensive tasks. Each server is also powered by dual 800 W titanium-grade power supplies, ensuring greater energy efficiency across its global data centers.
Cloudflare emphasizes that these improvements are not just about brute force, but also about delivering more efficient performance. The company says the move from a 1U to a 2U form factor, along with an improved airflow design, cut fan power consumption by 150 W, contributing to the server's overall efficiency gains. The Gen 12 server draws 600 W under normal operating conditions, a notable increase over the Gen 11's 400 W, but one the company considers justified by the significant performance improvements.
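Taken at face value, the quoted figures line up. Treating "energy efficiency" as requests served per watt (an assumption on our part; Cloudflare does not spell out the metric), the 145% RPS gain combined with the rise from 400 W to 600 W implies roughly the 63% improvement the company cites:

```python
# Sanity check on Cloudflare's published Gen 12 figures:
# up to 145% more requests per second, with typical power
# rising from 400 W (Gen 11) to 600 W (Gen 12).
gen11_rps, gen11_watts = 1.0, 400   # normalized Gen 11 RPS baseline
gen12_rps, gen12_watts = 2.45, 600  # 145% more RPS than Gen 11

# Efficiency = requests per watt; compare the two generations.
efficiency_gain = (gen12_rps / gen12_watts) / (gen11_rps / gen11_watts) - 1
print(f"{efficiency_gain:.0%}")  # → 63%
```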
The new generation also includes enhanced security features, with hardware root of trust (HRoT) and Data Center Secure Control Module (DC-SCM 2.0) integration. This combination ensures boot firmware integrity, provides modular security, protects against firmware attacks, and reduces vulnerabilities.
The Gen 12 servers are designed with GPU scalability in mind, supporting up to two PCIe add-in cards for AI inference and other specialized workloads. This design allows Cloudflare to strategically deploy GPUs to minimize latency in regions with high demand for AI processing. Looking ahead, Cloudflare says it has begun testing fifth-generation AMD EPYC “Turin” CPUs for its future Gen 13 servers.
Updated AI developer products
Cloudflare has also introduced major upgrades to its AI developer products. Workers AI now runs on more powerful GPUs across the company's network of more than 180 cities, allowing it to serve larger models such as Meta's Llama 3.1 70B and Llama 3.2 and to tackle more complex AI tasks. AI Gateway, a tool for monitoring and optimizing AI deployments, has gained persistent logs (currently in beta) that enable detailed performance analysis through search, tagging, and annotation. Vectorize, Cloudflare's vector database, is now generally available, supports indexes of up to five million vectors, and delivers significantly lower latency. Finally, Cloudflare has moved all three products to a simpler, unit-based pricing structure, making cost management clearer.