The ability to develop a unique and promising research hypothesis is a fundamental skill for any scientist. It can also be time-consuming: new PhD students might spend the first year of their program trying to decide exactly what to explore in their experiments. What if artificial intelligence could help?
MIT researchers have created a way to autonomously generate and evaluate promising research hypotheses across fields, through human-AI collaboration. In a new paper, they describe how they used this framework to create evidence-driven hypotheses that align with unmet research needs in the field of biologically inspired materials.
Published Wednesday in Advanced Materials, the study was co-authored by Alireza Ghafarollahi, a postdoc in the Laboratory for Atomistic and Molecular Mechanics (LAMM), and Markus Buehler, the Jerry McAfee Professor in Engineering in MIT's Department of Civil and Environmental Engineering and director of LAMM.
The framework, which the researchers call SciAgents, consists of multiple AI agents, each with specific capabilities and access to data, that leverage "graph reasoning" methods, in which AI models utilize a knowledge graph that organizes and defines relationships between diverse scientific concepts. The multi-agent approach mimics the way biological systems organize themselves as groups of elementary building blocks. Buehler notes that this "divide and conquer" principle is a prominent paradigm in biology at many levels, from materials to swarms of insects to civilizations, all examples in which the total intelligence is much greater than the sum of the individuals' abilities.
"By using multiple AI agents, we're trying to simulate the process by which communities of scientists make discoveries," says Buehler. "At MIT, we do that by having a bunch of people with different backgrounds working together and bumping into each other at coffee shops or in MIT's Infinite Corridor. But that's very coincidental and slow. Our quest is to simulate the process of discovery by exploring whether AI systems can be creative and make discoveries."
Automating good ideas
As recent developments have shown, large language models (LLMs) have displayed an impressive ability to answer questions, summarize information, and execute simple tasks. But they are quite limited when it comes to generating new ideas from scratch. The MIT researchers wanted to design a system that enabled AI models to perform a more sophisticated, multistep process that goes beyond recalling information learned during training, to extrapolate and create new knowledge.
The foundation of their approach is an ontological knowledge graph, which organizes and makes connections between diverse scientific concepts. To make the graphs, the researchers feed a set of scientific papers into a generative AI model. In previous work, Buehler used a field of math known as category theory to help the AI model develop abstractions of scientific concepts as graphs, grounded in defining relationships between components, in a way that could be analyzed by other models through a process called graph reasoning. This focuses AI models on developing a more principled way of understanding concepts; it also allows them to generalize better across domains.
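The paper is not a code release, but the graph-construction step can be sketched in plain Python: a model extracts concept-relation-concept triples from each paper, and the triples accumulate into an adjacency structure. The `extract_triples` stub below stands in for the actual LLM call, and the sample abstracts and triples are illustrative assumptions, not the authors' pipeline.

```python
from collections import defaultdict

def extract_triples(abstract: str) -> list[tuple[str, str, str]]:
    """Stand-in for an LLM call that turns a paper abstract into
    (concept, relation, concept) triples. Hard-coded for illustration."""
    lookup = {
        "silk": [("silk", "exhibits", "high tensile strength"),
                 ("silk", "is a", "biopolymer")],
        "collagen": [("collagen", "forms", "scaffolds"),
                     ("scaffolds", "support", "tissue engineering")],
    }
    return [t for key, triples in lookup.items() if key in abstract
            for t in triples]

def build_knowledge_graph(abstracts: list[str]) -> dict[str, set]:
    """Accumulate triples from all papers into an adjacency map:
    node -> {(relation, neighbor), ...}."""
    graph: dict[str, set] = defaultdict(set)
    for abstract in abstracts:
        for subj, rel, obj in extract_triples(abstract):
            graph[subj].add((rel, obj))  # directed, relation-labeled edge
            graph[obj]                   # ensure the object node exists too
    return graph

papers = ["A study of silk fibers...", "New collagen scaffolds for repair..."]
kg = build_knowledge_graph(papers)
```

With a real extraction model in place of the stub, running this over roughly 1,000 papers would yield the kind of ontological graph the agents later traverse.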
"This is really important for us in creating science-focused AI models, as scientific theories are typically grounded in generalizable principles rather than just knowledge recall," says Buehler. "By focusing AI models on 'thinking' in such a manner, we can leapfrog beyond conventional methods and explore more creative uses of AI."
For the most recent paper, the researchers used about 1,000 scientific studies on biological materials, but Buehler says the knowledge graphs could be generated using far more or fewer research papers from any field.
With the graph established, the researchers developed an AI system for scientific discovery, with multiple models specialized to play specific roles in the system. Most of the components were built off of ChatGPT-4 series models and made use of a technique known as in-context learning, in which prompts provide contextual information about the model's role in the system while allowing it to learn from the data provided.
The individual agents in the framework interact with each other to collectively solve a complex problem that none of them would be able to do alone. The first task they are given is to generate a research hypothesis. The LLM interactions begin after a subgraph has been defined from the knowledge graph, which can happen randomly or by manually entering a pair of keywords discussed in the papers.
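That seeding step, finding a subgraph that connects two keyword nodes, can be sketched as a path search over the graph. The toy graph and node names below are made up for illustration; the real system samples paths on a far larger ontological graph.

```python
from collections import deque

# Toy undirected knowledge graph: node -> set of neighboring concepts.
GRAPH = {
    "silk": {"biopolymer", "tensile strength"},
    "biopolymer": {"silk", "processing energy"},
    "processing energy": {"biopolymer", "energy intensive"},
    "energy intensive": {"processing energy"},
    "tensile strength": {"silk"},
}

def keyword_path(graph, start, goal):
    """Breadth-first search for a path between two keyword nodes;
    the resulting path is the subgraph that seeds the agents' discussion."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection between the two keywords

subgraph = keyword_path(GRAPH, "silk", "energy intensive")
```

Random seeding, the other mode the article mentions, would simply pick the two endpoint nodes at random before running the same search.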
Within the framework, a language model the researchers dubbed the "Ontologist" is tasked with defining scientific terms in the papers and examining the connections between them, fleshing out the knowledge graph. A model named "Scientist 1" then crafts a research proposal based on factors like its potential to uncover unexpected properties and its novelty. The proposal includes a discussion of potential findings, the impact of the research, and a guess at the underlying mechanisms of action. The "Scientist 2" model expands on the idea, suggesting specific experimental and simulation approaches and making other improvements. Finally, a "Critic" model highlights the proposal's strengths and weaknesses and suggests further improvements.
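This Ontologist → Scientist 1 → Scientist 2 → Critic relay amounts to a sequential pipeline in which each role is a prompt applied to the running transcript. A minimal sketch, with the `llm` function stubbed out in place of a real model call and the role instructions paraphrased from the article rather than taken from the actual system:

```python
ROLES = [
    ("Ontologist", "Define each concept in the subgraph and the relations between them."),
    ("Scientist 1", "Draft a research proposal: expected findings, impact, proposed mechanism."),
    ("Scientist 2", "Expand the proposal with concrete experimental and simulation approaches."),
    ("Critic", "List strengths and weaknesses, and suggest concrete improvements."),
]

def llm(role: str, instruction: str, transcript: str) -> str:
    """Stub for a chat-model call; a real system would send the role
    instruction plus the transcript so far to an LLM API here."""
    return f"[{role}] {instruction}"

def run_pipeline(subgraph: list[str]) -> str:
    transcript = "Seed concepts: " + " -> ".join(subgraph)
    for role, instruction in ROLES:
        # Each agent sees everything produced so far (in-context learning),
        # so later agents build on and critique earlier output.
        transcript += "\n" + llm(role, instruction, transcript)
    return transcript

result = run_pipeline(["silk", "biopolymer", "energy intensive"])
```

The deliberate asymmetry of the roles, especially the adversarial Critic at the end, is what keeps the agents from simply agreeing with one another, as Buehler notes below.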
"It's about building a team of experts that aren't all thinking the same way," says Buehler. "They have to think differently and have different capabilities. The Critic agent is deliberately programmed to critique the others, so you don't have everybody agreeing and saying it's a great idea. You have an agent saying, 'There's a weakness here, can you explain it better?' That makes the output much different from what you get from individual models."
Other agents in the system are able to search the existing literature, which provides the system with a way not only to assess feasibility, but also to create and assess the novelty of each idea.
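The article does not specify how novelty is scored, but one crude stand-in is lexical overlap with the retrieved literature: the less a hypothesis overlaps with existing abstracts, the higher its provisional novelty. A real system would use semantic retrieval over embeddings rather than raw word overlap; this Jaccard-based version is purely illustrative.

```python
def tokens(text: str) -> set[str]:
    return set(text.lower().split())

def novelty_score(hypothesis: str, corpus: list[str]) -> float:
    """1 minus the maximum Jaccard word-overlap against any known abstract:
    1.0 means no overlap with the literature, 0.0 means a verbatim match."""
    h = tokens(hypothesis)
    best = 0.0
    for abstract in corpus:
        a = tokens(abstract)
        best = max(best, len(h & a) / len(h | a))
    return 1.0 - best

corpus = ["silk fibers show high tensile strength",
          "collagen scaffolds for tissue repair"]
score_known = novelty_score("silk fibers show high tensile strength", corpus)
score_new = novelty_score("dandelion pigments reinforce silk composites", corpus)
```

A feasibility check would work the other way around: an idea with no grounding at all in the literature may be novel but impractical, which is why the system weighs both signals.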
Making the system stronger
To validate their approach, Buehler and Ghafarollahi built a knowledge graph based on the words "silk" and "energy intensive." Using the framework, the "Scientist 1" model proposed integrating silk with dandelion-based pigments to create biomaterials with enhanced optical and mechanical properties. The model predicted that the material would be significantly stronger than traditional silk materials and require less energy to process.
Scientist 2 then made suggestions, such as using specific molecular dynamics simulation tools to explore how the proposed materials would interact, and added that a good application for the material would be a bio-inspired adhesive. The Critic model then highlighted several strengths of the proposed material along with areas for improvement, like its scalability, long-term stability, and the environmental impact of solvent use. To address those concerns, the Critic suggested conducting pilot studies to validate the process and performing rigorous analyses of material durability.
The researchers also conducted other experiments with randomly chosen keywords, which produced a range of original hypotheses about topics including more efficient biomimetic microfluidic systems, enhancing the mechanical properties of collagen-based scaffolds, and the interaction between graphene and amyloid fibrils to create bioelectronic devices.
"The system was able to come up with these new, rigorous ideas based on the path from the knowledge graph," says Ghafarollahi. "In terms of novelty and applicability, the materials seemed robust and novel. In future work, we're going to generate thousands or tens of thousands of new research ideas, and then we can categorize them and try to understand better how these materials are generated and how they could be improved further."
Going forward, the researchers hope to incorporate new tools for retrieving information and running simulations into their framework. They can also easily swap out the foundation models within the framework for more advanced models, allowing the system to adapt to the latest innovations in AI.
"Because of the way these agents interact, an improvement in one model, even if it's slight, has a huge impact on the overall behavior and output of the system," Buehler says.
Since publishing their approach open-source, the researchers have been contacted by hundreds of people interested in using the framework in diverse scientific fields, and even in areas like finance and cybersecurity.
"There's a lot you can do without having to go to the lab," Buehler says. "You basically want to go to the lab at the very end of the process. The lab is expensive and takes a long time, so you want a system that can drill very deep into the best ideas, formulating the best hypotheses and accurately predicting emergent behaviors. Our vision is to make this easy to use, so you can use an app to bring in other ideas or pull in datasets to challenge the models."