Obscure startup wins prestigious CES 2024 award — you’ve probably never heard of it, but Panmnesia is the company that could make ChatGPT 6 (or 7) times faster
A Korean startup has scooped a coveted Innovation Award at the upcoming Consumer Electronics Show (CES) 2024 in January for its AI accelerator.
Panmnesia built its AI accelerator on Compute Express Link (CXL) 3.0 technology, integrating a CXL 3.0 controller into the accelerator chip. CXL allows an external memory pool to be shared with host computers and components such as the CPU, which can translate into virtually unlimited memory capacity.
CXL is used to connect system devices including accelerators, memory expanders, processors, and switches. By linking multiple accelerators and memory expanders through CXL switches, the technology can supply the vast memory capacity that demanding AI workloads require.
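To picture that topology, here is a minimal, purely illustrative Python sketch of a switch fabric aggregating memory expanders into a single pool. The class names, device names, and capacities are all invented for the example; a real CXL fabric is a PCIe-based hardware protocol, not Python objects.

```python
# Toy model of a CXL switch fabric aggregating memory expanders into one pool.
# Purely illustrative: names and capacities are hypothetical, not real CXL APIs.

class MemoryExpander:
    """A memory device contributing capacity to the shared pool (hypothetical)."""
    def __init__(self, name: str, capacity_gb: int):
        self.name = name
        self.capacity_gb = capacity_gb

class CXLSwitch:
    """Connects hosts, accelerators, and expanders; exposes their aggregate capacity."""
    def __init__(self):
        self.expanders: list[MemoryExpander] = []

    def attach(self, expander: MemoryExpander) -> None:
        self.expanders.append(expander)

    def total_pool_gb(self) -> int:
        return sum(e.capacity_gb for e in self.expanders)

switch = CXLSwitch()
for i in range(4):
    switch.attach(MemoryExpander(f"expander-{i}", capacity_gb=512))

# Every host behind the switch sees one large pool rather than four small devices.
print(f"Pooled capacity visible to hosts: {switch.total_pool_gb()} GB")  # 2048 GB
```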
What CXL 3.0 means for LLMs
With CXL 2.0, each host in these types of devices can access only its own dedicated portion of the pooled external memory; the latest generation of the standard, CXL 3.0, lets hosts access the entire pool as and when needed.
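The difference is easiest to see in a toy model. The sketch below assumes a hypothetical 2 TB pool and four hosts, and simply contrasts the two access schemes; it is not real CXL tooling.

```python
# Toy contrast of CXL 2.0-style partitioned pooling vs CXL 3.0-style full sharing.
# Illustrative only; the pool size and host names are invented.

POOL_GB = 2048
HOSTS = ["host-a", "host-b", "host-c", "host-d"]

# CXL 2.0: the pool is carved into fixed slices, one per host.
cxl2_view = {host: POOL_GB // len(HOSTS) for host in HOSTS}

# CXL 3.0: every host can address the entire pool on demand.
cxl3_view = {host: POOL_GB for host in HOSTS}

for host in HOSTS:
    print(f"{host}: CXL 2.0 sees {cxl2_view[host]} GB, "
          f"CXL 3.0 can reach {cxl3_view[host]} GB")
```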
“We believe our CXL technology will be a cornerstone for the next generation AI acceleration system,” said Panmnesia founder and CEO Myoungsoo Jung in a statement.
“We remain committed to revolutionizing not only the AI acceleration system, but also other general environments such as data centers, cloud computing and high-performance computing.”
Panmnesia's technology works similarly to the way clusters of servers share external SSDs to store data, and it would be especially useful for servers, which often need access to more data than they can hold in onboard memory.
The device is purpose-built for large-scale AI applications, and its makers claim it performs AI-based search functions 101 times faster than conventional services, which store data on SSDs linked over networks. The architecture also minimizes energy costs and operational expenses.
If used, alongside hardware from other vendors, in the servers that companies such as OpenAI rely on to host large language models (LLMs) like ChatGPT, it could dramatically improve the performance of these models.