The problem of AI hallucinations is getting worse

Despite significant progress in artificial intelligence, an adverse trend is emerging: the latest and most sophisticated AI models, especially those that use complex “reasoning”, show a significant increase in inaccurate and fabricated information, a phenomenon commonly referred to as “hallucinations”. This development has puzzled industry leaders and poses significant challenges to the widespread, reliable use of AI technology.

Recent testing of the latest models from major players such as OpenAI and DeepSeek reveals a surprising reality: these supposedly more intelligent systems generate incorrect information at higher rates than their predecessors. OpenAI's own evaluations, detailed in its latest research report, showed that its newest models, o3 and o4-mini, released in April, suffered from significantly more hallucinations than the earlier o1 model from late 2024. When answering questions about public figures, for example, o3 hallucinated 33% of the time, while o4-mini did so 48% of the time. By contrast, the older o1 model had a hallucination rate of only 16%.

The problem is not isolated to OpenAI. Independent tests by Vectara, which evaluates AI models, indicate that several “reasoning” models, including DeepSeek's R1, have experienced a significant increase in hallucinations compared to previous iterations from the same developers. These reasoning models are designed to imitate human thought processes by breaking problems down into multiple steps before arriving at an answer.

The implications of this rise in inaccuracies are significant. As AI chatbots are increasingly integrated into various applications – from customer service and research assistance to legal and medical fields – the reliability of their output becomes paramount. A customer service bot providing incorrect information about company policies, as users of the Cursor programming tool experienced, or a legal AI citing non-existent case law can lead to significant user frustration and even serious real-world consequences.

While AI companies initially expressed optimism that hallucinations would naturally decline with each model update, the latest data paint a different picture. Even OpenAI acknowledges the problem, with a company spokesperson stating: “Hallucinations are not inherently more prevalent in reasoning models, though we are actively working to reduce the higher rates of hallucination we saw in o3 and o4-mini.” The company maintains that research into the causes and mitigation of hallucinations across all models remains a priority.

The underlying reasons for this increase in errors in more advanced models remain somewhat elusive. Given the sheer volume of data on which these systems are trained and the complex mathematical processes they employ, pinpointing the exact causes of hallucinations is a significant challenge for technologists. Some theories suggest that the step-by-step “thinking” process in reasoning models creates more opportunities for errors to compound. Others suggest that training methodologies such as reinforcement learning, while beneficial for tasks like mathematics and coding, may have unintentionally degraded factual accuracy in other areas.

Researchers are actively studying potential solutions to mitigate this growing problem. Strategies under investigation include training models to recognize and express uncertainty, as well as retrieval-based techniques that allow an AI to consult external, verified sources of information before generating answers.
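
To make those two ideas concrete, here is a minimal, purely illustrative Python sketch; it is not drawn from any of the systems or vendors mentioned above, and the source snippets, matching rule, and threshold are invented for the example. It answers only from a small set of verified sources and explicitly says it is unsure when no source matches, rather than inventing an answer.

```python
import re

# Hypothetical "verified sources" store; a real system would query a
# curated document index, not a hard-coded dictionary.
VERIFIED_SOURCES = {
    "when was o1 released": "OpenAI released the o1 model in late 2024.",
    "what is a reasoning model": (
        "A reasoning model breaks a problem into intermediate steps "
        "before producing its final answer."
    ),
}

def answer(question: str, min_overlap: int = 2) -> str:
    """Return an answer grounded in a verified source, or express uncertainty."""
    q_words = set(re.findall(r"[a-z0-9]+", question.lower()))
    best_key, best_score = None, 0
    for key in VERIFIED_SOURCES:
        score = len(q_words & set(key.split()))
        if score > best_score:
            best_key, best_score = key, score
    if best_key is None or best_score < min_overlap:
        # Expressing uncertainty is preferred over a confident fabrication.
        return "I'm not sure; I couldn't find a verified source for that."
    return VERIFIED_SOURCES[best_key]

if __name__ == "__main__":
    print(answer("When was o1 released?"))      # grounded answer
    print(answer("Who won the 2030 World Cup?"))  # no source -> uncertainty
```

The design choice the sketch illustrates is the fallback path: when retrieval finds nothing trustworthy, the system declines to answer instead of generating a plausible-sounding guess.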

However, some experts caution against the term “hallucination” itself. They argue that it implicitly suggests a level of consciousness or perception that AI models do not possess. Instead, they view these inaccuracies as a fundamental aspect of the probabilistic nature of current language models.

Despite ongoing efforts to improve accuracy, the recent trend suggests that the road to truly reliable AI may be more complex than initially expected. For now, users are advised to exercise caution and apply critical thinking when interacting with even the most advanced AI chatbots, especially when seeking factual information. The “growing pains” of artificial intelligence development, it seems, are far from over.
