Recent artificial intelligence models, ChatGPT o3 and o4-mini, hallucinate roughly twice as often as their less advanced predecessors. The figures come from OpenAI's own testing.
In the context of neural networks, hallucinations are responses that contradict reality yet are delivered by the AI with complete confidence in their accuracy. In the PersonQA test, which assesses ChatGPT's knowledge of people, o3 hallucinated in 33% of cases and o4-mini in 43% of queries. For comparison, the figure did not exceed 15% for o3-mini.
Another test, conducted by the independent research lab Transluce, found that the o3 model tends to invent its own actions. In one exchange, for example, the AI claimed it had run program code on an Apple MacBook Pro 2021 "outside of ChatGPT" and copied the resulting numbers into its answer. In reality, the model has no such capability.
One way to combat hallucinations is to give the AI access to web search, which lets it ground its answers in more reliable, up-to-date information. This approach worked for the non-reasoning GPT-4o model, so developers hope it will also help more advanced reasoning models.
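For readers curious what "giving the AI web search" looks like in practice, here is a minimal sketch using the OpenAI Python SDK's Responses API with a web-search tool. The model name, the `web_search_preview` tool type, and the response fields are assumptions based on the publicly documented SDK and may differ in your account or SDK version.

```python
# Minimal sketch: asking a model a factual question with web search enabled,
# so the answer can be grounded in retrieved pages rather than model memory.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# environment variable; tool and model names here are assumptions.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",                           # assumed non-reasoning model name
    tools=[{"type": "web_search_preview"}],   # assumed web-search tool type
    input="In which year was the Apple MacBook Pro 2021 released?",
)

# output_text aggregates the model's textual answer from the response.
print(response.output_text)
```

Whether the same grounding strategy reduces hallucinations in the reasoning models o3 and o4-mini is, per the article, still an open question for the developers.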