Hidden bias in large language models

Large language models (LLMs), such as GPT-4 and Claude, have transformed artificial intelligence with their ability to process and generate human-like text. But beneath their impressive capabilities lies a subtle and often overlooked problem: position bias. This is the tendency of these models to overemphasize information at the beginning and end of a document while neglecting the content in the middle. The bias can have significant real-world consequences, potentially leading to inaccurate or incomplete answers from AI systems.

A team of MIT researchers has now pinpointed the underlying cause of this flaw. Their study reveals that position bias does not stem only from the training data used to teach LLMs, but also from fundamental design choices in the architecture itself, particularly the way transformer-based models handle attention and the positions of words.

Transformers, the neural network architecture behind most LLMs, work by encoding sentences into tokens and learning how those tokens relate to one another. To understand long text sequences, the models use attention mechanisms, which allow tokens to selectively "focus" on related tokens elsewhere in the sequence and help the model understand context.
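To make the mechanism concrete, here is a minimal sketch of scaled dot-product attention in plain NumPy. It is an illustration of the general technique, not the internals of any particular model: each token's query is compared against every key, and the resulting weights decide how much of each value vector flows into that token's new representation.

```python
# Minimal sketch of scaled dot-product attention (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights         # weighted mix of the value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))
out, w = attention(Q, K, V)
print(w.round(2))  # row i shows how much token i "focuses" on every token
```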

However, because of the enormous computational cost of letting every token attend to every other token, developers often use causal masking, which restricts each token to attending only to the tokens that come before it in the sequence. In addition, positional encodings are added to help the model keep track of word order.
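The sketch below shows these two ingredients in isolation, assuming the standard Transformer recipe rather than the paper's exact setup: a causal mask that blocks attention to future tokens, and a sinusoidal positional encoding that is added to token embeddings so word order is not lost.

```python
# Causal mask and sinusoidal positional encoding, shown in isolation.
import numpy as np

def causal_mask(seq_len):
    # True where attention is allowed: token i may look only at tokens <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def sinusoidal_positions(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions use cosine
    return pe

print(causal_mask(5).astype(int))          # lower-triangular: no token sees the future
print(sinusoidal_positions(5, 8).round(2)) # one encoding vector per position
```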

The MIT team developed a graph-based theoretical framework to examine how these architectural choices affect the flow of attention inside the models. Their analysis shows that causal masking inherently biases models toward the beginning of the input, regardless of how important that content actually is. Moreover, as more attention layers are added, a widespread strategy for increasing model performance, this bias grows stronger.
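A toy calculation can illustrate the intuition (a simplification for exposition, not the authors' graph-theoretic framework): if each causally masked layer is modeled as a matrix in which every token attends uniformly to itself and all earlier tokens, composing layers shows how much of a token's final representation traces back to each input position.

```python
# Toy illustration: stacking causally masked layers concentrates influence
# on early positions, even when attention within each layer is uniform.
import numpy as np

def uniform_causal_attention(seq_len):
    A = np.tril(np.ones((seq_len, seq_len)))
    return A / A.sum(axis=1, keepdims=True)  # row i attends equally to tokens 0..i

seq_len, num_layers = 10, 4
A = uniform_causal_attention(seq_len)

# Influence of each input position on the last token, layer after layer:
for L in range(1, num_layers + 1):
    last_row = np.linalg.matrix_power(A, L)[-1]
    print(f"{L} layer(s):", last_row.round(3))
# The mass shifts toward position 0 as layers are stacked, even though no
# token is semantically more important than any other.
```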

This discovery aligns with the real-world challenges faced by developers building AI systems. Learn more about QDAT's experience building a smarter retrieval-augmented generation (RAG) system using graph databases. Our case study touches on some of the same architectural constraints and shows how to preserve structured relationships and contextual meaning in practice.

According to Xinyi Wu, an MIT doctoral student and lead author of the study, the framework helped show that even when the data is neutral, the architecture itself can skew where the model focuses.

To test their theory, the team ran experiments in which the correct answer was placed at different positions within a text. They found a clear U-shaped pattern: the models performed best when the answer appeared at the beginning, slightly worse when it appeared at the end, and worst when it was buried in the middle, a phenomenon known as "lost in the middle."
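A sketch of such a position-sweep test is shown below. The ask_model function is a hypothetical stand-in for whatever LLM API is being evaluated; the point is the bookkeeping of placing the same fact at different depths in a long context and recording accuracy per position.

```python
# Position-sweep harness in the spirit of the experiment described above.
from typing import Callable

def build_context(fact: str, filler_sentences: list[str], position: float) -> str:
    """Insert `fact` at a relative position (0.0 = start, 1.0 = end)."""
    idx = round(position * len(filler_sentences))
    return " ".join(filler_sentences[:idx] + [fact] + filler_sentences[idx:])

def position_sweep(ask_model: Callable[[str, str], str],
                   fact: str, question: str, expected: str,
                   filler_sentences: list[str],
                   positions=(0.0, 0.25, 0.5, 0.75, 1.0)) -> dict[float, bool]:
    results = {}
    for p in positions:
        context = build_context(fact, filler_sentences, p)
        answer = ask_model(context, question)
        results[p] = expected.lower() in answer.lower()  # did the model recover the fact?
    return results

# Dummy "model" that only retains the start and end of its context; it
# reproduces the U-shaped, lost-in-the-middle pattern by construction.
def dummy_model(context: str, question: str) -> str:
    return context[:200] + context[-200:]

filler = [f"Filler sentence number {i}." for i in range(200)]
print(position_sweep(dummy_model, "The access code is 4812.",
                     "What is the access code?", "4812", filler))
```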

Their work also uncovered potential ways to mitigate this bias. Positional encodings designed to tie tokens more strongly to nearby words can significantly reduce position bias. Simplifying models by reducing the number of attention layers, or experimenting with alternative masking strategies, can also help. While model architecture plays an important role, biased training data can still reinforce the problem.
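One concrete way to couple attention more strongly to nearby tokens, offered here only as an illustrative example rather than the specific remedy proposed in the study, is to add a linear distance penalty to the attention scores before the softmax, the idea behind ALiBi-style biases.

```python
# Distance-penalized causal attention: farther-away tokens are downweighted.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention_weights(scores, distance_slope=0.0):
    seq_len = scores.shape[0]
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    penalty = -distance_slope * (i - j)                   # grows with distance into the past
    masked = np.where(j <= i, scores + penalty, -np.inf)  # causal mask
    return softmax(masked, axis=-1)

rng = np.random.default_rng(1)
scores = rng.normal(size=(8, 8))
print(causal_attention_weights(scores, distance_slope=0.0)[-1].round(2))
print(causal_attention_weights(scores, distance_slope=1.0)[-1].round(2))
# With a larger slope, the last token's attention concentrates on nearby
# tokens rather than being spread across the whole sequence.
```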

This study provides valuable insight into the inner workings of AI systems that are increasingly used in high-stakes domains, from legal research to medical diagnosis to code generation.

As Ali Jadbabaie, professor and head of MIT's Department of Civil and Environmental Engineering, emphasized, these models are black boxes. Most users do not realize that the order of the input can affect the accuracy of the output. If people are to trust artificial intelligence in critical applications, they must understand when and why it fails.
