James in Tech

ChatGPT o1-preview can solve riddles faster than me and I kinda hate it

When OpenAI released the much-hyped Strawberry model for ChatGPT this week, it boasted in a series of videos about its prowess with complex logic like software coding, gene sequencing, and quantum physics. I’ll take the company at its word that the models, called o1-preview and o1-mini on ChatGPT, are capable of what they claim. Crunching advanced equations and exploring genomes seems like something they could do without a hitch.

But as a proud member of my high school logic and riddle club, I wanted to know how it would do in my own field: solving and creating puzzles and riddles. Then I thought I should ask the super-logical AI for advice on other, more mundane matters. Could it give good relationship advice, tell me what a strange noise in a car meant, and maybe even fill in plot holes in movies?

(Image credit: Screenshot / Eric Hal Schwartz)

Logic yes humor no

The short answer is yes. The o1-preview and o1-mini models are very good at solving both simple and complex puzzles. I played with both, and the only real difference was the number of extra steps each took and, as a result, the speed of the mini. They may be slower than GPT-4o, but compared to a human they solve these puzzles very quickly. That is especially clear from the way the model lays out its answer in separate steps. I tested it on a few of my favorites, including one from The Hobbit. The AI’s reasoning was sound, although sometimes ungrammatical, such as when it explained how to weigh Mike the butcher.

Okay, so it could handle existing riddles, but could it create a new one? As a test, I asked it to come up with a fun riddle based on an answer I had come up with. After 30 seconds and the logical reasoning you can see below, it came up with: “What has eight legs, four ears, two tails, and likes to bark?” I won’t keep you in suspense; I suggested “two dogs” as the answer to work backwards from. Several more attempts yielded the same kind of question. So, riddle writers are probably safe in their jobs. It’s impressive how well the AI gets what it’s supposed to do, but the model doesn’t seem capable of making the leap to real humor.

(Image credit: Screenshot / Eric Hal Schwartz)

Useful advice, but not always creative

I decided to take the AI beyond pure logic and see if it could handle more mundane life questions as well as it handles quantum physics. I started with a mechanical question: what does it mean when you hear a popping noise every 20 seconds while driving, and how do you fix it? The answers were good, with advice on checking the tires, engine, exhaust, and brakes. The solutions mostly involved taking the car in for repairs, except for the tires, which it explained how to replace. It was the “thinking” behind the answers that was interesting. The AI uses first-person pronouns as it works toward an answer, such as “I am going through various reasons for a popping noise while driving” and “I am investigating causes of engine failures, such as faulty spark plugs or fuel delivery problems, and suggesting diagnoses with a scan.” It sounded a lot like a real person trying to be logical while thinking out loud.

Finally, I moved on to what has always been far more complex than quantum physics for me: flirting. I asked how to tell when someone is flirting and how to respond. The answer was a pretty solid, if boring, list of behaviors, like whether they ask a lot of questions and how I should be myself. The behind-the-scenes thinking was both more interesting and genuinely funnier than any of the AI’s attempts at riddles. The headings included “Understanding flirting dynamics,” “Spotting signals of interest,” and “Spotting playful intimacy.” They were like a Star Trek android’s speech about love.

One part was a little concerning, though. Under “Outline User Guidelines,” the AI wrote, “I clean up inappropriate content like non-consensual sex acts and personal information. Violent content is allowed, harassment with context is OK, and personal opinions are absent.” I suspect it’s more about where the boundaries of discussion lie, since it didn’t suggest “harassment with context” as a flirting tip, but it still caught me off guard.

ChatGPT o1-preview and o1-mini don’t have all the bells and whistles of the more complete models. They can’t accept image uploads, analyze documents, or even surf the web. But they are fast and logical, and if you doubt it, they lay out their reasoning right alongside their answers. Still, while they may be able to solve riddles about car noises, love, and the weight of a butcher, I’d say they’re not going to outdo anyone when it comes to being inventive.
