A tiny startup has helped Intel trounce AMD and Nvidia in critical AI tests — is it game over already?
Numenta has shown that Intel Xeon CPUs can vastly outperform the best CPUs and the best GPUs in AI workloads by applying a new approach to them.
Using a series of techniques based on this idea, branded under the Numenta Platform for Intelligent Computing (NuPIC) label, the startup has unlocked new levels of performance for AI inference on conventional CPUs, according to Serve the House.
The really amazing thing is that it apparently outperforms GPUs and CPUs specifically designed to handle AI inference. For example, Numenta took a workload for which Nvidia was reporting performance numbers with its A100 GPU, and ran it on an augmented 48-core 4th generation Sapphire Rapids CPU. In all scenarios, it was faster than Nvidia’s chip, based on overall throughput. In fact, it was 64 times faster than a 3rd generation Intel Xeon processor and ten times faster than the A100 GPU.
Improving AI performance with neuroscience
Known for its neuroscience-inspired approach to AI workloads, Numenta relies heavily on the idea of sparse computing – the way the brain forms connections between neurons.
Most CPUs and GPUs today are designed for dense computing, especially for AI, which relies on brute force rather than the contextual way the brain works. While sparsity is a surefire way to improve performance, CPUs are not built to exploit it well. This is where Numenta steps in.
This startup wants to leverage the efficiency gains of sparse computing in AI models by applying its ‘secret sauce’ to general-purpose CPUs instead of chips purpose-built to handle AI-focused workloads.
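To make the dense-versus-sparse distinction concrete, here is a minimal Python sketch of the general idea. It is a toy illustration, not Numenta's method: the function names (`dense_matvec`, `to_sparse`, `sparse_matvec`) and the example weights are hypothetical and chosen only for this example. A dense routine multiplies every weight by its input, zeros included, while a sparse routine stores and touches only the non-zero weights.

```python
def dense_matvec(weights, x):
    """Dense approach: multiply every weight by its input, zeros included."""
    ops = 0
    out = []
    for row in weights:
        total = 0.0
        for w, xi in zip(row, x):
            total += w * xi
            ops += 1  # count one multiply-add per weight
        out.append(total)
    return out, ops


def to_sparse(weights):
    """Keep only the non-zero weights of each row as (column, value) pairs."""
    return [[(j, w) for j, w in enumerate(row) if w != 0.0] for row in weights]


def sparse_matvec(sparse_weights, x):
    """Sparse approach: only the stored non-zero weights are multiplied."""
    ops = 0
    out = []
    for row in sparse_weights:
        total = 0.0
        for j, w in row:
            total += w * x[j]
            ops += 1  # count one multiply-add per non-zero weight
        out.append(total)
    return out, ops


# Hypothetical 3x4 weight matrix in which most entries are zero.
weights = [
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.5],
    [3.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

dense_out, dense_ops = dense_matvec(weights, x)
sparse_out, sparse_ops = sparse_matvec(to_sparse(weights), x)

print(dense_out, dense_ops)
print(sparse_out, sparse_ops)
```

Both routines produce the same result, but the sparse one performs a multiply-add only for each non-zero weight. The saving on paper is the easy part; the irregular memory access in the sparse loop is the kind of pattern that hardware tuned for dense math tends to handle poorly.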
Although it can run on both CPUs and GPUs, Numenta took Intel
These are extensions of the x86 architecture – serving as additional instruction sets that allow CPUs to perform more demanding functions.
Numenta delivers its NuPIC service using Docker containers, and it can run on a company’s own servers. If it works in practice, it would be an optimal solution for repurposing CPUs already deployed in data centers for AI workloads, especially in light of the long wait times for Nvidia’s industry-leading A100 and H100 GPUs.