Sales of AI smartphones and laptops are said to be slowly declining – but is anyone surprised?
Almost every year we get a report saying that something in the PC industry is dying, or fading, or that the days of some aspect of computer technology are numbered.
So when I came across an article about Micron not selling enough memory chips for AI PCs and smartphones – which led the company to lower its revenue expectations for the coming quarters, and sent some people panicking that ‘AI is dying’ – well, it didn’t surprise me at all.
This industry enjoys a bit of doom and gloom every now and then, but much of this background noise comes down to how the public understands modern AI as a whole – especially in the enthusiast sector.
Let me be clear: AI is not dying – we know that. You only have to look at how well Nvidia is doing to understand how wrong that statement is. The point is that, across all the countless AI laptops and phones and other gadgets out there – everything currently marketed under the AI banner (I’ll go into a long argument about that here) – the vast majority of AI processing doesn’t happen on your little laptop. It simply doesn’t.
Even today’s best custom gaming PC barely has the ability to run something like ChatGPT at 10% of its total capacity – and that’s assuming you could run it at all, since it isn’t an open-source program that anyone can just download.
Unfortunately, it requires far too much data and processing power to run those kinds of programs in full locally on a desktop. There are workarounds and alternative apps, but they generally pale in comparison to the likes of Gemini or GPT, both in depth of knowledge and response times. That’s not exactly surprising, considering you’re trying to compete with multiple server blades running in real time. I’m sorry, your RTX 4090 just isn’t going to cut it, my friend.
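For context, the kind of local workaround I’m talking about looks something like this – a minimal sketch, assuming Python and the Hugging Face transformers library (neither is named above), with the model purely an illustrative placeholder for any small open-weight model that fits in consumer VRAM:

```python
# Minimal sketch of a local 'workaround': a small open-weight chat model run
# entirely on your own hardware. The model below is an illustrative placeholder;
# anything in the low-billions-of-parameters range that fits in consumer VRAM applies.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # ~1.1B parameters
    device_map="auto",                           # uses your GPU if present, CPU otherwise
)

prompt = "Explain in two sentences why large language models are usually served from data centres."
print(generator(prompt, max_new_tokens=80)[0]["generated_text"])
```

Even then, you’re running a model orders of magnitude smaller than the frontier systems it’s competing with, which is exactly why the comparison falls flat.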
And that’s another important point: even looking at your custom PC, anyone who tells you that a CPU with a built-in NPU can beat something like an aging RTX 3080 in AI workloads is misleading you. Run something like UL’s Procyon benchmark suite with its AI Computer Vision test, and you’ll see a desktop RTX 4080 score around 700% to 800% higher than an Intel Core Ultra 9 185H-powered laptop. That’s not a small margin, and it’s giving the Intel chip the benefit of the doubt by not using Nvidia’s TensorRT API, which pushes the results even further in Team Green’s favor.
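If you want to see that sort of gap for yourself without a Procyon license, a rough stand-in (to be clear, this is not Procyon’s methodology) is to push the same vision model through different inference back-ends and compare throughput – for example with ONNX Runtime, where the model path below is just a placeholder for whatever exported network you have to hand:

```python
# Rough analogue of an AI Computer Vision throughput test: run the same model
# through different ONNX Runtime execution providers and compare inferences/sec.
# 'resnet50.onnx' is a placeholder path for any exported image model.
import time
import numpy as np
import onnxruntime as ort

def benchmark(providers, model_path="resnet50.onnx", runs=100):
    session = ort.InferenceSession(model_path, providers=providers)
    input_name = session.get_inputs()[0].name
    batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # one 224x224 RGB image
    session.run(None, {input_name: batch})                     # warm-up pass
    start = time.perf_counter()
    for _ in range(runs):
        session.run(None, {input_name: batch})
    return runs / (time.perf_counter() - start)                # inferences per second

print("CPU :", benchmark(["CPUExecutionProvider"]))
print("CUDA:", benchmark(["CUDAExecutionProvider", "CPUExecutionProvider"]))
# Swapping in "TensorrtExecutionProvider" typically widens the gap further still.
```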
The point is that the companies, tools and techniques doing well in the AI ecosystem are already well established. If you have an RTX graphics card, chances are you already have enough performance to run rings around most modern ‘AI’ CPUs with a built-in NPU. And virtually every AI program worth running relies on server blades to deliver its performance – very few run locally or without some form of connection to the cloud.
Google has now pretty much rolled out Gemini to most of its Android devices, and it’ll also land on its Nest speakers in the coming months (with a beta version technically already available, thanks to the Google Home Public Preview program). And just to be clear, that’s a four-year-old speaker at this point – not exactly cutting-edge hardware.
This is just the beginning
Many years ago I had a conversation with Roy Taylor, who at the time was AMD’s Corporate Vice President of Media & Entertainment, specializing in VR and developments in that field.
My memory is a bit hazy, but the long and short of the conversation was this: to achieve a truly lifelike experience in VR – with a pixel density high enough and a frame rate smooth enough that a human couldn’t tell the difference from reality – we would need GPUs that can deliver petaflops of performance. I think the exact figure was around 90 PFLOPS (for reference, an RTX 4090 is still well over 100x less powerful than that).
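As a quick sanity check on that gap – taking the quoted ~90 PFLOPS target at face value and using the RTX 4090’s published peak of roughly 83 TFLOPS of FP32 compute (both figures are approximate):

```python
# Back-of-the-envelope check on the 'well over 100x' claim.
vr_target_flops = 90e15   # ~90 PFLOPS for an 'indistinguishable from reality' VR experience
rtx_4090_flops  = 83e12   # ~83 TFLOPS peak FP32 for an RTX 4090

print(f"Shortfall: roughly {vr_target_flops / rtx_4090_flops:,.0f}x")  # ~1,084x
```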
To me, it feels like local AI falls into the same camp. It’s a realm of apps, utilities and tools that will likely never appear on your local gaming PC, but will instead reside exclusively on server blades and supercomputers. There is simply no way an isolated computer system can compete – even if we stopped all AI development as it stands, it would take us years to catch up in terms of overall performance. That’s not necessarily a bad thing or the end of the world either.
There is a silver lining for us off-the-grid folks, and it all comes down to the GPU manufacturers. AI workloads, and machine learning in particular, rely primarily on parallel computing. That’s something GPUs are extremely good at – far better than CPUs – and especially Nvidia GPUs with their Tensor cores. It’s the technology behind all those DLSS and FSR models we know and love, boosting frame rates without sacrificing in-game graphical fidelity.
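You can see that parallelism advantage first-hand with a tiny experiment (assuming PyTorch, purely for illustration): time the same large matrix multiply – the core operation behind these models – on the CPU and then on a CUDA GPU:

```python
# Time an identical large matrix multiply on CPU and (if available) a CUDA GPU.
# Exact numbers depend entirely on your hardware; the point is the order-of-magnitude gap.
import time
import torch

def time_matmul(device, size=4096, repeats=10):
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    torch.matmul(a, b)                      # warm-up
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()            # wait for the GPU to actually finish
    return (time.perf_counter() - start) / repeats

print(f"CPU : {time_matmul('cpu'):.4f} s per multiply")
if torch.cuda.is_available():
    print(f"CUDA: {time_matmul('cuda'):.4f} s per multiply")
```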
However, developing a GPU from the ground up takes time – a lot of time. For a brand-new architecture, we’re talking several years. That means the RTX 40 series was likely in development around 2020/2021, at an estimate, and similarly the RTX 50 series (presumably imminent at this point) likely started around 2022/2023, with various teams moving from task to task as they became available. All of this predates the thawing of the most recent AI winter and the arrival of ChatGPT.
What that tells us is that unless Nvidia can radically change its designs, the RTX 50 series will likely continue the success of Lovelace (the RTX 40 series), which should give us even better AI performance. But it probably won’t be until the RTX 60 series – the first generation designed from the ground up in the post-ChatGPT era – that we see AI capacity and performance increase in a way we haven’t seen before from these GPUs. That could be the generation of graphics cards that makes localized LLMs a reality instead of a pipe dream.