OpenAI proudly debuted ChatGPT search in October as the next phase for search engines. The company boasted that the new feature combined ChatGPT’s conversational skills with the best web search tools, offering real-time information in a more useful form than any list of links. According to a recent study by Columbia University’s Tow Center for Digital Journalism, that celebration may have been premature. The report suggests that ChatGPT has a somewhat laissez-faire attitude toward accuracy, attribution, and basic reality when sourcing news stories.
What’s particularly striking is that the problems occur regardless of whether a publication blocks OpenAI’s web crawlers or has an official licensing agreement with OpenAI for its content. The study took 200 quotes from 20 publications and asked ChatGPT to source them. The results were all over the map.
Sometimes the chatbot got it right. Other times, it attributed quotes to the wrong outlet or simply made up a source. OpenAI’s partners, including The Wall Street Journal, The Atlantic, and publications owned by Axel Springer and Meredith, sometimes fared better, but not with any consistency.
Gambling on accuracy is not what OpenAI or its partners want from a news search. The licensing deals were announced as a way for OpenAI to support journalism while improving ChatGPT’s accuracy. Yet when ChatGPT pulled quotes from Politico, which is published by OpenAI partner Axel Springer, it often attributed them to the wrong speaker.
AI news to lose
The short answer to why this happens lies in how ChatGPT finds and processes information. The web crawlers it uses to access data may perform perfectly, but the AI model underlying ChatGPT can still make mistakes and hallucinate. Licensed access to content does not change that basic fact.
Of course, if a publication blocks OpenAI’s web crawlers, ChatGPT can go from newshound to wolf in sheep’s clothing. When outlets such as The New York Times use robots.txt files to keep ChatGPT away from their content, the AI plods along and comes up with sources anyway instead of telling you it doesn’t have an answer. More than a third of the responses in the report fit this description. That’s no small caveat. Worse still, when ChatGPT couldn’t access legitimate sources, it would turn to places where the same content had been republished without permission, perpetuating plagiarism.
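For readers unfamiliar with the mechanism, that blocking is done with a plain-text robots.txt file at a site’s root that names the crawlers to turn away. A minimal sketch, assuming a publisher wants to refuse both of OpenAI’s documented crawlers (GPTBot, which gathers training data, and OAI-SearchBot, which powers ChatGPT search), might look like this:

    # Refuse OpenAI's training-data crawler
    User-agent: GPTBot
    Disallow: /

    # Refuse the crawler behind ChatGPT search
    User-agent: OAI-SearchBot
    Disallow: /

As the Tow Center’s findings suggest, though, a file like this only keeps the crawler out; it doesn’t stop the model from confidently guessing at what sits behind the wall.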
Ultimately, the misattributions themselves matter less than what they imply for journalism and for AI tools like ChatGPT. OpenAI wants ChatGPT search to be the place people go for fast, reliable answers that are properly linked and cited. If it fails at that, it will undermine confidence in both the AI and the journalism that feeds it. For OpenAI’s partners, the licensing revenue may not be worth the traffic lost to unreliable links and citations.
So while ChatGPT search can be a boon for many tasks, you’ll want to check those links yourself to make sure the AI isn’t hallucinating its answers from the internet.