Using advanced AI to fix critical software vulnerabilities
Today, we're sharing early results from our CodeMender research, a new AI-powered agent that automatically improves code security.
Software vulnerabilities are extremely difficult and time-consuming for developers to find and fix, even with traditional automated methods like fuzzing. Our AI-based efforts, such as Big Sleep and OSS-Fuzz, have demonstrated AI's ability to find new zero-day vulnerabilities in well-tested software. As we achieve more breakthroughs in AI-powered vulnerability discovery, it will become increasingly difficult for humans alone to keep up.
CodeMender helps solve this problem by taking a comprehensive approach to code security that is both reactive, instantly patching new vulnerabilities, and proactive, rewriting and hardening existing code to eliminate entire classes of vulnerabilities in the process. Over the past six months of building CodeMender, we have already upstreamed 72 security fixes to open source projects, including some as large as 4.5 million lines of code.
By automatically creating and applying high-quality security patches, CodeMender's AI-powered agent helps developers and maintainers focus on what they do best: building good software.
CodeMender in Action
CodeMender works by leveraging the reasoning capabilities of our latest Gemini Deep Think models to produce an autonomous agent capable of debugging and fixing complex security vulnerabilities.
To do this, the CodeMender agent is equipped with robust tools that let it reason about code before making changes, and automatically validate those changes to make sure they are correct and don't cause regressions.
Animation showing the process of fixing a security vulnerability.
While large language models are improving quickly, mistakes in code security can be costly. CodeMender's automatic validation process ensures that code changes are correct along many dimensions, surfacing for human review only high-quality patches that, for example, fix the root cause of the issue, are functionally correct, cause no regressions, and follow style guidelines.
As part of our research, we also developed new techniques and tools that allow CodeMender to reason about code and validate its changes more effectively. These include:
- Advanced program analysis: We developed tools based on advanced program analysis, including static analysis, dynamic analysis, differential testing, fuzzing, and SMT solvers. By using these tools to systematically examine code patterns, control flow, and data flow, CodeMender can better identify the root causes of security flaws and architectural weaknesses.
- Multi-agent systems: We developed specialized agents that enable CodeMender to tackle specific aspects of the underlying problem. For example, CodeMender uses a large language model-based critique tool that highlights the differences between the original and modified code, to verify that the proposed changes don't introduce regressions, and to self-correct as needed.
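Differential testing, one of the validation techniques listed above, can be illustrated with a minimal sketch. This is not CodeMender's actual tooling; the two popcount functions below are hypothetical stand-ins for an original function and its patched replacement, and the idea is simply to run both on many random inputs and flag any behavioral divergence as a potential regression.

```c
// Minimal differential-testing sketch (illustrative only): run an original
// function and its patched replacement on random inputs and count divergences.
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical original implementation: counts set bits naively. */
static int popcount_reference(uint32_t x) {
    int n = 0;
    while (x) { n += (int)(x & 1u); x >>= 1; }
    return n;
}

/* Hypothetical patched implementation: a faster bit-trick variant. */
static int popcount_patched(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (int)((x * 0x01010101u) >> 24);
}

/* Run both versions on `trials` random inputs; return divergence count.
 * Zero divergences is (weak) evidence the patch preserves behavior. */
static int differential_test(unsigned trials, unsigned seed) {
    srand(seed);
    int divergences = 0;
    for (unsigned i = 0; i < trials; i++) {
        uint32_t input = ((uint32_t)rand() << 16) ^ (uint32_t)rand();
        if (popcount_reference(input) != popcount_patched(input))
            divergences++;
    }
    return divergences;
}
```

In practice a tool like this would be combined with coverage-guided fuzzing rather than plain `rand()`, so that the inputs exercise all interesting paths of the patched code.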
Fixing vulnerabilities
To effectively patch a vulnerability and prevent it from reappearing, CodeMender uses a debugger, a source code browser, and other tools to pinpoint root causes and develop patches. We've included two examples of CodeMender patching security vulnerabilities in the video carousel below.
Example 1: Identifying the root cause of a vulnerability
Here is an excerpt of the agent's reasoning about the root cause behind a patch generated by CodeMender, after it analyzed debugger output and results from a code search tool.
Although the final patch in this example changed only a few lines of code, the root cause of the vulnerability was not immediately clear. In this case, the crash report indicated a heap buffer overflow, but the actual problem lay elsewhere: incorrect handling of Extensible Markup Language (XML) during parsing.
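The gap between crash site and root cause can be shown with a deliberately tiny, hypothetical example (this is not the actual bug the agent fixed). A sanitizer would report the overflow inside `copy_payload`, yet the real defect is an off-by-one length computation in `parse_header`; a root-cause fix patches the parser, not the crash site. All names below are invented for illustration.

```c
// Illustrative miniature: the crash manifests in copy_payload(), but the
// root cause is the earlier length computation in parse_header().
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t payload_len;   /* bytes the payload buffer must hold */
} header_t;

/* Root cause lived here: the buggy version set payload_len = msg_len,
 * forgetting the trailing NUL. The fixed version reserves room for it. */
static void parse_header(header_t *h, size_t msg_len) {
    h->payload_len = msg_len + 1;   /* fix: +1 for the terminator */
}

/* Crash site in the buggy build: this writes msg_len + 1 bytes into a
 * buffer sized from payload_len. With the fix above, the write fits. */
static char *copy_payload(const header_t *h, const char *msg, size_t msg_len) {
    char *buf = malloc(h->payload_len);
    if (!buf) return NULL;
    memcpy(buf, msg, msg_len);
    buf[msg_len] = '\0';   /* overflowed when payload_len == msg_len */
    return buf;
}
```

Patching only the crash site (for example, enlarging the allocation inside `copy_payload`) would mask the symptom while leaving every other caller of the miscomputed length exposed, which is why root-cause analysis matters.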
Example 2: The agent can create non-trivial patches
In this example, the CodeMender agent was able to devise a non-trivial patch addressing a complex object lifetime issue.
Not only was the agent able to determine the root cause of the vulnerability, it was also able to modify a completely custom C code generation system used by the project.
Proactively rewriting existing code for better security
We also designed CodeMender to proactively rewrite existing code to use safer data structures and APIs.
For example, we had CodeMender apply -fbounds-safety annotations to parts of libwebp, a widely used image compression library. When -fbounds-safety annotations are applied, the compiler adds bounds checks to the code, preventing an attacker from exploiting a buffer overflow to execute arbitrary code.
A few years ago, a heap buffer overflow vulnerability in libwebp (CVE-2023-4863) was exploited by a threat actor as part of a 0-click iOS exploit. With -fbounds-safety annotations in place, that vulnerability, along with most other buffer overflows in the parts of the project where we applied the annotations, would be rendered unexploitable forever.
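To make the mechanism concrete, here is a hedged sketch of the annotation style used by Clang's experimental -fbounds-safety extension, where attributes such as `__counted_by` tie a pointer to the field holding its element count so the compiler can insert runtime bounds checks. The struct and function names are invented for illustration and are not libwebp code; the fallback macro keeps the snippet compilable with ordinary compilers that lack the extension.

```c
// Sketch of -fbounds-safety-style annotations (hypothetical types, not
// libwebp source). Under a compiler supporting the extension, indexing
// `pixels` past num_pixels traps at runtime instead of corrupting memory.
#include <stddef.h>
#include <stdint.h>

#ifndef __counted_by
#define __counted_by(count)   /* no-op fallback without -fbounds-safety */
#endif

typedef struct {
    size_t num_pixels;
    /* The annotation tells the compiler that `pixels` points to exactly
     * num_pixels elements, enabling automatic bounds checks. */
    uint8_t *__counted_by(num_pixels) pixels;
} image_row_t;

/* With bounds checking enabled, an attacker-controlled index past
 * num_pixels aborts the program rather than overflowing the buffer. */
static uint8_t get_pixel(const image_row_t *row, size_t i) {
    return row->pixels[i];
}
```

The design tradeoff is a modest runtime cost for the inserted checks in exchange for turning silent memory corruption into a clean, unexploitable abort.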
In the video carousel below, we show examples of the agent's decision-making process, including its validation steps.
Example 1: The agent's reasoning steps
In this example, the CodeMender agent is asked to resolve the following -fbounds-safety error on the bit_depths pointer:
Example 2: The agent automatically corrects errors and test failures
Another key capability of CodeMender is its ability to automatically fix new compilation errors and any test failures caused by its own annotations. Here is an example of the agent recovering from a compilation error.
Example 3: The agent validates its changes
In this example, the CodeMender agent modifies a function and then uses an LLM judge tool configured for functional equivalence to verify that the functionality remains intact. When the tool detects a discrepancy, the agent self-corrects based on the LLM judge's feedback.
Securing software for everyone
While our early results from CodeMender are promising, we're taking a cautious approach focused on reliability. Currently, all patches generated by CodeMender are reviewed by human researchers before being submitted upstream.
Using CodeMender, we have already begun submitting patches to various critical open source libraries, many of which have been accepted and merged upstream. We are scaling up this process gradually to ensure quality and to systematically incorporate feedback from open source communities.
We will also gradually reach out to interested maintainers of critical open source projects with CodeMender-generated patches. By iterating on feedback from this process, we hope to release CodeMender as a tool that every developer can use to keep their codebases secure.
We have many more techniques and results to share, which we plan to publish as technical papers and reports in the coming months. With CodeMender, we are only beginning to explore AI's remarkable potential to improve software security for everyone.
Acknowledgements
Credits (listed in alphabetical order):
Alex Rebert, Armman Hasanzadeh, Carlo Lemos, Charles Sutton, Dongge Liu, Gogul Balakrishnan, Hiep Chy, James Zern, Koushik Sen, Liang, Max Shavrick, Oliver Chang, and Petros Maniatis.