‘Today’s computers are terribly inefficient’: How an American startup is taking a page from Apple, combining hardware and software to tackle AI’s 99% energy waste problem
Efficient Computer lives up to its name by creating what it describes as the most energy-efficient programmable processor.
The startup emerged from stealth in March 2024 with $16 million in seed funding led by Eclipse VC, and claims to have built a completely new technology stack, from compiler to silicon, in a year.
The company’s approach is to create what it describes as a “general-purpose post-von Neumann processor design that is easy to program and also extremely power efficient.”
Efficient structuring of memory
Brandon Lucia, founder and CEO of Efficient Computer, said: “Today’s computers are terribly inefficient. The dominant ‘von Neumann’ processor design wastes 99% of the energy. Unfortunately, this inefficiency is deeply ingrained in their design. In von Neumann processors, programs are expressed as a sequence of simple instructions, but executing programs in a simple sequence is unacceptably slow. Improving performance requires complex hardware to find instructions that can be safely executed in parallel. Improving efficiency requires a fundamental rethink of the way we design computers.”
In practice, that means that instead of executing a series of instructions one after another, as von Neumann designs do, the company’s architecture “expresses programs as a ‘circuit’ of instructions that shows which instructions talk to each other.” This design, called the Fabric processor architecture, has been implemented in the Monza test SoC.
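To make the “circuit of instructions” idea concrete, here is a minimal sketch in Python. It is not Efficient Computer’s instruction set or compiler; the names `Node` and `run_dataflow` are hypothetical, and the model is the textbook dataflow one: each operation names the operations that feed it, and anything whose inputs are ready can fire in the same step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple
import operator


@dataclass
class Node:
    """One operation in the graph (hypothetical illustration only)."""
    op: Callable[..., int]
    inputs: Tuple[str, ...]  # names of the values this operation consumes


def run_dataflow(graph: Dict[str, Node], inputs: Dict[str, int]):
    """Evaluate the graph in 'waves': every node whose inputs are all
    available fires in the same wave, with no program counter involved."""
    values = dict(inputs)
    pending = dict(graph)
    waves = 0
    while pending:
        ready = [name for name, node in pending.items()
                 if all(src in values for src in node.inputs)]
        if not ready:
            raise ValueError("graph has a cycle or a missing input")
        for name in ready:
            node = pending.pop(name)
            values[name] = node.op(*(values[src] for src in node.inputs))
        waves += 1
    return values, waves


# (a + b) * (c - d) written as a graph rather than a list of instructions.
GRAPH = {
    "sum": Node(operator.add, ("a", "b")),
    "diff": Node(operator.sub, ("c", "d")),
    "prod": Node(operator.mul, ("sum", "diff")),
}

values, waves = run_dataflow(GRAPH, {"a": 2, "b": 3, "c": 7, "d": 4})
print(values["prod"], waves)
```

In this toy example the addition and subtraction do not depend on each other, so they fire together and the three operations finish in two waves. A strictly sequential machine would need three steps, or extra hardware to discover at run time that the first two are independent.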
Lucia was recently interviewed by eeNews Europe and further explained what the company’s approach entails. “What’s fundamentally different is that the architecture was developed simultaneously with a compiler and a software stack based on research at Carnegie Mellon, and we designed it with generality in mind,” he said. “We don’t need any register flow and we don’t need to fetch instructions every cycle. Some of the tiles are also memory access tiles – that’s an efficient way to structure memory.”
Initial efficiency is 1.3 to 1.5 TOPS/W, with the chip drawing 500 mW to 600 mW, but that is just the beginning. “As we look to the future, we have a roadmap to scale the architecture as we explore the design space. By early 2025, we can achieve 100 GOPS on 200 MHz and we think we can scale that performance 10 to 100x with the same efficiency,” he said in the interview.
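For a rough sense of scale, the quoted figures can be combined in a back-of-envelope calculation. This is only a sketch: it assumes the 100 GOPS target would be delivered at the same 1.3 to 1.5 TOPS/W efficiency, which the interview does not state outright, and the variable names are illustrative.

```python
# Figures quoted in the article.
TARGET_OPS_PER_SEC = 100 * 10**9      # 100 GOPS
CLOCK_HZ = 200 * 10**6                # 200 MHz
EFFICIENCY_TOPS_PER_W = (1.3, 1.5)    # quoted efficiency range
SCALE_FACTORS = (10, 100)             # "10 to 100x"

# Operations completed per clock cycle at the quoted target.
ops_per_cycle = TARGET_OPS_PER_SEC // CLOCK_HZ

# Power implied by the target IF it runs at the quoted efficiency (assumption).
implied_power_mw = tuple(
    TARGET_OPS_PER_SEC / (tops_per_w * 1e12) * 1e3
    for tops_per_w in EFFICIENCY_TOPS_PER_W
)

# Throughput after the hoped-for 10x to 100x scaling, in TOPS.
scaled_tops = tuple(TARGET_OPS_PER_SEC * factor / 1e12 for factor in SCALE_FACTORS)

print(ops_per_cycle, implied_power_mw, scaled_tops)
```

Under those assumptions the target works out to 500 operations per clock cycle, roughly 67 mW to 77 mW of power, and 1 to 10 TOPS if the scaling ambitions are met.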