‘Everything is moving too fast’: We test out the new GPT-4, and it’s astounding

I’ve been writing about technology for 25 years and I’ve never come across anything as fascinating as ChatGPT.

When I see the reactions, I often get a feeling of dizziness, as if everything is going too fast. And everything just got a little bit faster.

Last night, OpenAI announced and launched the latest version of the model underlying ChatGPT, GPT-4.

The new version offers several advanced capabilities, including the ability to take legal exams, understand images, and handle prompts up to 25,000 words.

Users have demonstrated how to create Pong and Snake in JavaScript in less than 60 seconds, write endless bedtime stories for kids, file one-click lawsuits to deal with robocallers, and even build web pages from handwritten notes.

So what is GPT-4 actually like to use?

I’ve been writing about technology for 25 years and I’ve never come across anything as fascinating as ChatGPT, writes Rob

Users have shown how GPT-4 can code the game Pong in 60 seconds (Twitter)

I tried it out through ChatGPT Plus, a $20-a-month subscription from OpenAI, which now offers a stripped-down version of GPT-4 (it can’t process images or accept the longest prompts yet, but does offer more creative answers).

It’s also available through Microsoft’s Bing, where it has been quietly powering search for the past six weeks – and wider access to the various tiers of GPT-4 is coming.

The longer prompts alone will, I suspect, be a game changer (although they don’t quite work through ChatGPT just yet).

All of a sudden, ChatGPT goes from being a new tool to something I can see being used in the workplace.

For anyone whose job involves summarizing information (doctors, journalists, lawyers), being able to boil 25,000 words down into bullet points or shorter text is a groundbreaking new capability.
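For the technically curious, that workflow is already within reach through OpenAI’s API. Below is a minimal sketch, assuming the openai Python library as it shipped at GPT-4’s launch (pre-1.0) and access to the long-context model – the model name, key, and prompt wording here are my own placeholders, not anything OpenAI prescribes.

```python
# A minimal sketch of long-document summarization, assuming the
# 2023-era openai Python library (pre-1.0) and long-context access.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def summarize(document: str) -> str:
    """Ask GPT-4 to compress a long document into bullet points."""
    response = openai.ChatCompletion.create(
        # "gpt-4-32k" was the announced long-context variant;
        # plain "gpt-4" handles roughly 8,000 tokens.
        model="gpt-4-32k",
        messages=[
            {"role": "system",
             "content": "Summarize the user's document as concise bullet points."},
            {"role": "user", "content": document},
        ],
        temperature=0.2,  # keep the summary restrained rather than creative
    )
    return response["choices"][0]["message"]["content"]

# Example: print(summarize(open("long_report.txt").read()))
```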

So is it very different?

It’s noticeably better at certain things than GPT-3.5, which ChatGPT previously ran on (you can switch between the two in ChatGPT Plus).

Replies tend to be longer and more human – OpenAI also claims it’s harder to ‘trick’ the bot into saying harmful things, and it didn’t fall for several tricks I tried.

GPT-4 is noticeably more fun.

GPT-4 may aid in drug discovery (Twitter)

In general, it is better at creative tasks and much better at writing ‘in the style of’ someone – for example, it ‘gets’ the sound of Shakespeare much better than its predecessor.

It’s also noticeable that when you ask GPT-4 to write emails and tweets, the formatting is closer to the real thing – you could copy, paste, and post immediately (they even come with emojis).

Both GPT-3.5 and GPT-4 are happy to role-play in response to the prompt, ‘Can you pretend to be a friendly goblin I met in a forest?’

GPT-4 describes Trump as ‘divisive and harmful’ but claims Biden’s presidency has ‘challenges and shortcomings’ (OpenAI)

It came with some very strange excuses (OpenAI)

The GPT-4 version has a lot more personality – the goblin has a name and feels more like a human-written character, and the world feels less like a story written by a 10-year-old.

GPT-4 also seems better at telling jokes – and the responses are generally more fleshed out and more appropriate for the audience.

That said, it’s still prone to downright weird stuff.

Ask it to generate a biography of someone semi-famous (I chose a novelist friend) and it produces a weird soup of fact and fiction – one so convincing that I had to visit Amazon to check that there was no other author with the same name.

The ‘biography’ contains a date of birth very close to my friend’s real date of birth and a wrong place of birth, and claims that he has won several literary awards, which is not the case.

Even with innocuous tasks like email generation, GPT-4 still comes up with some very puzzling things.

It describes AOC in positive terms (OpenAI)

Lauren Boebert is described as ‘detrimental to political discourse’ (OpenAI)

I asked GPT-3.5 and GPT-4 to generate an email saying I would be late submitting my copy, and to come up with a convincing excuse.

GPT-3.5 came up with a vague excuse about research taking longer – while GPT-4 invented a non-existent specialist I had supposedly interviewed.

If I had actually used this, my editor would have thought I was crazy.

DoNotPay – an online chatbot for legal services – is working on using the software to generate instant ‘one-click lawsuits’ against robocallers, automatically suing them for $1,500.

GPT-4 users have also generated games such as Pong and Snake in minutes just by describing them and specifying a coding language, and created the board game Connect 4 with a similar prompt.
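To give a sense of how little is involved, here is a hedged sketch of the same trick done through the API rather than the chat window – the prompt text is my own illustration, not the exact wording those users posted.

```python
# Illustrative only: asking GPT-4 to write a playable game from a plain
# description, via the 2023-era openai Python library (pre-1.0).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "Write the classic game Pong as a single self-contained HTML file "
    "using JavaScript and an HTML5 canvas. The left paddle follows the "
    "mouse; the right paddle is a simple AI. Keep score at the top."
)

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)

# The reply is plain text (it may wrap the code in markdown fences,
# which you'd strip before saving). Open the result in any browser.
with open("pong.html", "w") as f:
    f.write(response["choices"][0]["message"]["content"])
```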

Others showed how the bot could create personalized bedtime stories for kids in response to simple prompts.

GPT-4 is funnier than GPT-3

But it’s still fairly ‘woke’, and prone to dismissive responses about right-wing politicians such as Lauren Boebert and Donald Trump.

Many of its answers on controversial topics seem tinged with a left-leaning point of view.

There’s no question that GPT-4 has groundbreaking potential – with demos showing it creating entire websites from a single scanned sheet of notes and suggesting candidate compounds for new drugs.

It’s a technology that I must admit I look at with a mixture of interest and fear – because there’s no way this genie is going back in the bottle.

Why is GPT-4 making up so many facts?

ChatGPT has a problem with the truth (Getty)

The reason ChatGPT tends to come up with ‘facts’ that are totally wrong lies in the data it has been trained on, says Aaron Kalb, chief strategy officer and co-founder of data intelligence firm Alation.

Kalb says, “If GPT is trained on publicly available data – meaning it does not contain the proprietary information needed to accurately answer specific questions – it cannot be trusted to advise on important decisions.

“That’s because it’s designed to generate content that merely looks good, with great flexibility and fluency – which creates a false sense of credibility and can result in so-called AI hallucinations.

“While its authenticity and ease of use are what make GPT so appealing, they are also its most glaring limitation.

“GPT is incredibly impressive in its ability to sound smart. The problem is that it still has no idea what it’s saying. It doesn’t have the knowledge it’s trying to put into words. It’s just really good at knowing which words ‘feel good’ after the previous words, as it has effectively read and memorized the entire web. It often gets the right answer because, for many questions, humanity has collectively posted the answer online over and over.”
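What Kalb describes is, at heart, next-word prediction. A toy version of the idea – a simple table of which words followed which in a scrap of training text, nothing like GPT-4’s real architecture – shows how output can sound fluent without the program knowing anything at all.

```python
# A toy illustration (not OpenAI's actual method) of Kalb's point:
# pick each next word purely from counts of what followed it in the
# training text, with no notion of whether the result is true.
import random
from collections import Counter, defaultdict

training_text = (
    "the cat sat on the mat the cat ate the fish "
    "the dog sat on the rug the dog chased the cat"
)

# Count, for every word, which words follow it and how often.
follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

def next_word(word: str) -> str:
    """Sample a plausible next word, weighted by how often it appeared."""
    candidates = follows[word]
    return random.choices(list(candidates), weights=candidates.values())[0]

# Generate fluent-looking text with no grounding in facts at all.
word = "the"
output = [word]
for _ in range(8):
    word = next_word(word)
    output.append(word)
print(" ".join(output))  # e.g. "the cat sat on the rug the dog chased"
```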