As AI language models grow more sophisticated, they play a key role in generating text across many domains. However, ensuring the accuracy of the information they produce remains a challenge. Misinformation, unintentional errors, and biased content can propagate quickly, affecting decision-making, public discourse, and users' trust.
Google DeepMind has presented a powerful AI fact-checking tool designed specifically for large language models (LLMs). The tool, called SAFE (Search-Augmented Factuality Evaluator), aims to increase the credibility and trustworthiness of AI-generated content.
SAFE takes a multi-step approach, using advanced AI techniques to meticulously analyze and verify factual claims. The system first splits long texts generated by an LLM into separate, self-contained facts. Each of these units is then rigorously checked, with SAFE using Google Search results to carry out comprehensive fact matching. What distinguishes SAFE is its use of multi-step reasoning: it generates search queries and analyzes the returned results to determine whether each claim is factually accurate.
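To make that pipeline concrete, here is a minimal Python sketch of the steps described above. It is an illustration, not DeepMind's implementation: llm() and google_search() are hypothetical placeholders for a language-model API and a web-search API, and the prompts are heavily simplified.

```python
# Minimal sketch of a SAFE-style fact-checking pipeline, under the
# assumption that we have access to an LLM and a web-search backend.
from typing import Dict, List


def llm(prompt: str) -> str:
    """Placeholder: call a large language model and return its text reply."""
    raise NotImplementedError


def google_search(query: str) -> str:
    """Placeholder: run a web search and return a snippet of results."""
    raise NotImplementedError


def split_into_facts(response: str) -> List[str]:
    # Step 1: have the model decompose a long answer into atomic claims.
    reply = llm(
        "Split the following text into a list of self-contained factual "
        f"claims, one per line:\n\n{response}"
    )
    return [line.strip() for line in reply.splitlines() if line.strip()]


def rate_fact(fact: str, max_steps: int = 3) -> str:
    # Steps 2-3: multi-step reasoning. Each round generates a search query
    # in light of the evidence gathered so far, then collects the results.
    evidence: List[str] = []
    for _ in range(max_steps):
        so_far = "\n".join(evidence) if evidence else "(none)"
        query = llm(
            f"Claim: {fact}\nEvidence so far:\n{so_far}\n"
            "Write the next Google search query to verify the claim."
        )
        evidence.append(google_search(query))
    # Final step: ask the model for a verdict given all collected evidence.
    verdict = llm(
        "Given the search results below, answer 'supported' or "
        f"'not supported' for the claim: {fact}\n\n" + "\n---\n".join(evidence)
    )
    return verdict.strip().lower()


def safe_check(response: str) -> Dict[str, str]:
    # Full pipeline: split the response, then rate each fact independently.
    return {fact: rate_fact(fact) for fact in split_into_facts(response)}
```

Feeding the accumulated evidence back into query generation is what makes the reasoning multi-step rather than a single lookup per claim.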
During extensive testing, the research team used SAFE to verify about 16,000 facts contained in responses from several LLMs. They compared its results against human (crowdsourced) fact-checkers and found that SAFE matched the human annotators' findings in 72% of cases. Notably, in cases where the two disagreed, SAFE outperformed the humans, being judged correct 76% of the time.
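A quick back-of-the-envelope calculation shows what those two percentages mean together. The extrapolation of the 76% win rate to every disagreement is illustrative only, since that figure was measured on a sample of disputed cases.

```python
# Illustrative arithmetic using the figures quoted above.
total_facts = 16_000
agreement_rate = 0.72          # SAFE matches human annotators
disagreement_win_rate = 0.76   # SAFE judged correct when they disagree

agreements = total_facts * agreement_rate      # ~11,520 facts
disagreements = total_facts - agreements       # ~4,480 facts
safe_wins = disagreements * disagreement_win_rate  # ~3,405 facts

print(f"Facts where SAFE and humans agree:        {agreements:.0f}")
print(f"Disputed facts where SAFE is judged right: {safe_wins:.0f}")
```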
The benefits of SAFE go beyond its accuracy. Its use is estimated to be about 20 times cheaper than relying on human fact-checkers, making it a cost-effective solution for processing the huge volumes of content generated by LLMs. In addition, SAFE's scalability makes it well suited to the challenge posed by the rapid growth of information in the digital era.
While SAFE represents a significant step forward for the development of LLMs, challenges remain. Keeping the tool current as information evolves and maintaining the balance between accuracy and performance are ongoing tasks.
DeepMind has released SAFE's code and the accompanying benchmark dataset publicly on GitHub. Researchers, developers, and organizations can use them to improve the credibility of AI-generated content.
Dive into the world of LLMs and explore efficient solutions to text-processing problems with large language models, llama.cpp, and the guidance library in our recent article "Optimizing text processing with LLMs: an insight into llama.cpp and guidance".