‘iPhone of AI’: Startup first to deliver trillion-plus parameter AI model that works in symbiosis with its own chip – SambaNova promises 90% savings on inference costs, but take that with a grain of salt

By James On Mar 3, 2024

While everyone wants in on the action, deploying generative AI at scale has proven to be a significant challenge for large enterprises and government agencies.

Despite recognition of the technology’s potential to streamline processes, reduce costs and improve supply chains, concerns about cost, complexity, security, data privacy, model ownership and regulatory compliance have acted as obstacles to its implementation.

In a potential breakthrough, SoftBank-funded SambaNova Systems has announced the launch of Samba-1, which it bills as the first one-trillion-parameter generative AI model for the enterprise. Samba-1 is powered by the SambaNova Suite and is designed to meet enterprise demands for performance, accuracy, scalability and total cost of ownership (TCO). The company also promises a 90% reduction in inference costs, although this claim should be approached with caution.

Building the ‘iPhone of AI’

Unlike other trillion-parameter models, which are built as single, monolithic entities, Samba-1 uses a Composition of Experts (CoE) architecture. This system combines several smaller “expert” models so that they function as one large model. According to SambaNova, this approach provides broader knowledge across different topics, high accuracy and multimodality.

The CoE model can also reportedly provide more knowledge and accuracy for specialized domains than other major models. Individual smaller models can be trained for specific domains, such as finance, law, physics or biology, and added to the CoE, providing high accuracy for that specific domain without the need to train the entire trillion-parameter model.
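To make the idea concrete, here is a minimal Python sketch of a composition-of-experts setup: a router picks one small expert per prompt, and a new domain expert can be added without touching the others. This is an illustration only, not SambaNova’s implementation. The class name, the keyword-overlap routing rule and the stand-in “experts” (plain functions) are all hypothetical assumptions; a production system would route between separately trained models, most likely with a learned router.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple

# An "expert" is modeled as a function from prompt to answer.
# In a real system each would be a separately trained model (assumption).
Expert = Callable[[str], str]


@dataclass
class CompositionOfExperts:
    """Toy router that sends each prompt to at most one domain expert."""

    fallback: Expert
    experts: Dict[str, Tuple[Set[str], Expert]] = field(default_factory=dict)

    def add_expert(self, domain: str, keywords: Set[str], expert: Expert) -> None:
        # Adding a domain does not modify or retrain any existing expert.
        self.experts[domain] = (set(keywords), expert)

    def route(self, prompt: str) -> Optional[str]:
        # Hypothetical routing rule: pick the domain whose keywords
        # overlap most with the words in the prompt.
        words = set(re.findall(r"[a-z]+", prompt.lower()))
        best_domain, best_score = None, 0
        for domain, (keywords, _) in self.experts.items():
            score = len(words & keywords)
            if score > best_score:
                best_domain, best_score = domain, score
        return best_domain  # None means no expert matched

    def answer(self, prompt: str) -> str:
        domain = self.route(prompt)
        expert = self.experts[domain][1] if domain is not None else self.fallback
        return expert(prompt)


coe = CompositionOfExperts(fallback=lambda p: f"[general] {p}")
coe.add_expert("finance", {"bond", "yield", "equity"}, lambda p: f"[finance] {p}")
coe.add_expert("law", {"contract", "liability", "statute"}, lambda p: f"[law] {p}")

print(coe.answer("What is the yield on this bond?"))
print(coe.answer("Tell me a joke"))
```

In this sketch only the selected expert runs for a given prompt, which is one plausible reason a composed system could be cheaper to serve than a monolithic model of the same total size. How large that saving is in practice is not something the sketch can show.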

The release of Samba-1 follows SambaNova’s announcement of the SN40L, a smart AI chip designed to rival those of AI giant Nvidia. According to the company, integrating this chip with the Samba-1 model makes SambaNova the first to deliver an integrated hardware and software system for the enterprise.

“The entire AI industry is talking about building the iPhone of AI – an integrated hardware and software system – and SambaNova is the first to deliver a version of it to the enterprise,” said Rodrigo Liang, co-founder and CEO of SambaNova Systems. “Last fall we announced the SN40L, the smartest AI chip, and now we have integrated that chip with the first 1T parameter model for the enterprise. Samba-1 competes with GPT-4, but is better suited for enterprises because it can be delivered on-premises or in private clouds, allowing customers to fine-tune the model with their private data without ever releasing it into the public domain.”

Despite Samba-1’s impressive capabilities, the company’s claim that the model reduces inference costs by 90% should be taken with a grain of salt. Although the CoE architecture promises lower inference costs, the true size of these savings will only become apparent once the model is deployed in real-world scenarios.

Liang told us: “AI is not a fad, we are at the beginning of this journey. Our full-stack solution is aimed at large-scale enterprises and government organizations, which no one else can offer on-site and privately. There’s no denying how dominant Nvidia is today, but we can deploy these models at scale at a fraction of the cost.”