‘Feels like magic!’: Groq’s ultra-fast LPU could be the first LLM-native processor – and the latest demo could convince Nvidia and AMD to get out their checkbooks
Groq, founded and led by former Google engineer Jonathan Ross, claims to have created the first-ever Language Processing Unit (LPU), which it says can deliver the fastest speeds for AI applications.
It’s a bold claim, but one that is amply supported by the latest demos, suggesting it could be an absolute game-changer for AI.
Ross, who previously designed Google’s tensor processing unit (TPU), launched Groq in 2016 to create a chip that can perform deep learning inference tasks more efficiently than existing CPUs and GPUs.
Lightning fast
The company’s Tensor Stream Processor (TSP) has been likened to an assembly line, processing data tasks in a sequential, organized manner. A GPU, on the other hand, resembles a static workstation, where workers come and go to apply processing steps. The efficiency of the TSP became apparent with the rise of generative AI, which led Ross to rename the TSP the Language Processing Unit (LPU) to increase its recognisability.
Unlike GPUs, LPUs use a streamlined approach, eliminating the need for complex scheduling hardware, ensuring consistent latency and throughput. LPUs are also energy efficient, reducing the overhead of managing multiple threads and avoiding underutilization of cores. Groq’s scalable chip design allows multiple TSPs to be connected without traditional bottlenecks, simplifying hardware requirements for large-scale AI models.
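The scheduling difference above can be sketched in a few lines of Python. This is a conceptual analogy, not Groq's actual hardware design: a statically scheduled pipeline has a latency that is known before execution starts, while a dynamically scheduled pool of workers (the GPU-style model) has a latency that depends on runtime contention. All function names and cycle counts here are illustrative assumptions.

```python
import random

def static_pipeline(ops, cycles_per_op=1):
    """Assembly-line model: timing fixed at compile time, so total
    latency is deterministic -- no queuing, no contention."""
    return len(ops) * cycles_per_op

def dynamic_schedule(ops, workers=4, seed=0):
    """Workstation model: a runtime scheduler assigns each op to the
    least-busy worker, and per-op cost varies (e.g. memory stalls),
    so total latency is only known after execution."""
    rng = random.Random(seed)
    busy_until = [0] * workers
    for _ in ops:
        w = min(range(workers), key=lambda i: busy_until[i])
        busy_until[w] += rng.randint(1, 3)  # 1-3 cycles, varies at runtime
    return max(busy_until)

ops = ["matmul"] * 16
print(static_pipeline(ops))   # always 16 cycles, knowable in advance
print(dynamic_schedule(ops))  # varies with scheduling decisions and seed
```

The point of the sketch is the predictability gap: the static pipeline's answer never changes, which is what lets the compiler, rather than on-chip scheduling hardware, plan execution.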
Groq’s first public demo was a lightning-fast AI answer engine that generated responses of several hundred words in less than a second. Matt Shumer, who posted the test on X, says that more than three-quarters of that time was spent searching, not generating.
"The first public demo using Groq: a lightning-fast AI Answers Engine. It writes factual, quoted answers with hundreds of words in less than a second. More than 3/4 of the time is spent searching, not generating! The LLM runs in a fraction of a second." — Matt Shumer on X, February 19, 2024
While that’s impressive in itself, seeing Groq go head-to-head with ChatGPT is something else entirely.
If you’d like to try Groq for yourself and get a sense of how fast it can be, head over to its chat page. Use the drop-down menu on the left to switch between the available models.