In the old days – the really old days – the task of designing materials was laborious. Over the course of more than 1,000 years, investigators tried to make gold by combining things like lead, mercury, and sulfur, mixed in what they hoped would be the right proportions. Even famous scientists, such as Tycho Brahe, Robert Boyle, and Isaac Newton, tried their hands at the fruitless endeavor we call alchemy.
Materials science has, of course, come a long way since then. For the past 150 years, scientists have had the periodic table of elements to draw upon, which tells them that different elements have different properties, and one can't magically transform into another. Moreover, in the past decade or so, machine learning tools have considerably boosted our capacity to determine the structure and physical properties of various molecules and substances. New research by a group led by Ju Li – the Tokyo Electric Power Company Professor of Nuclear Engineering at MIT and a professor of materials science and engineering – offers the promise of a major leap in capabilities that could facilitate materials design. The results of their investigation are reported in the December 2024 issue of Nature Computational Science.
At present, most of the machine learning models used to characterize molecular systems are based on density functional theory (DFT), which offers a quantum mechanical approach to determining the total energy of a molecule or crystal by looking at the electron density distribution – essentially, the average number of electrons located in a unit volume around each given point in space near the molecule. (Walter Kohn, who co-invented this theory six decades ago, received a Nobel Prize in Chemistry for it in 1998.)
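For reference, the standard Kohn-Sham formulation (textbook DFT background, not a formula from the new paper) writes the total energy as a functional of that electron density ρ(r):

```latex
E[\rho] = T_s[\rho]
        + \int v_{\mathrm{ext}}(\mathbf{r})\,\rho(\mathbf{r})\,d\mathbf{r}
        + \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'
        + E_{\mathrm{xc}}[\rho]
```

Here T_s is the kinetic energy of non-interacting electrons, the middle terms capture the external potential and the electron-electron repulsion, and E_xc is the exchange-correlation piece that every practical DFT calculation must approximate.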
“Couples therapy” to the rescue
Li's team, by contrast, relies on a different computational chemistry technique, also derived from quantum mechanics, known as coupled-cluster theory, or CCSD(T). “This is the gold standard of quantum chemistry,” Li comments. The results of CCSD(T) calculations are much more accurate than what you get from DFT calculations, and they can be as trustworthy as those currently obtainable from experiments. The problem is that carrying out these calculations on a computer is very slow, he says, “and the scaling is bad: If you double the number of electrons in the system, the computations become 100 times more expensive.” For that reason, CCSD(T) calculations have normally been limited to molecules with a small number of atoms – on the order of about 10. Anything much beyond that would simply take too long.
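To see why the scaling stings, consider the commonly cited asymptotic costs – roughly N³ for DFT and N⁷ for CCSD(T). The sketch below is a back-of-the-envelope illustration, not a benchmark from the paper:

```python
# Back-of-the-envelope cost scaling, assuming DFT ~ N^3 and CCSD(T) ~ N^7
# (commonly quoted asymptotic complexities, not figures from the paper).
def relative_cost(n, exponent, baseline=10):
    """Cost relative to a system with `baseline` electrons."""
    return (n / baseline) ** exponent

for n in (10, 20, 40):
    print(f"N={n:2d}  DFT ~{relative_cost(n, 3):6.0f}x   CCSD(T) ~{relative_cost(n, 7):8.0f}x")

# Doubling N multiplies the CCSD(T) cost by 2**7 = 128 - consistent with
# the "100 times more expensive" that Li describes.
```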
That's where machine learning comes in. CCSD(T) calculations are first performed on conventional computers, and the results are then used to train a neural network with a novel architecture specially devised by Li and his colleagues. Once trained, the neural network can perform these same calculations much faster, using approximation techniques. What's more, their neural network model can extract much more information about a molecule than just its energy. “In previous work, people have used multiple different models to assess different properties,” says Hao Tang, an MIT PhD student in materials science and engineering. “Here we use just one model to evaluate all of these properties, which is why we call it a ‘multi-task’ approach.”
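As a rough sketch of that train-once, predict-fast idea – the descriptors, data, and architecture below are placeholder stand-ins, not the authors' actual model – a surrogate network is fit to expensive reference labels and then queried cheaply:

```python
import torch

# Minimal surrogate-training sketch (a hypothetical stand-in, not MEHnet):
# expensive CCSD(T) runs produce (descriptor, energy) training pairs once,
# after which the cheap network answers new queries almost instantly.
descriptors = torch.randn(200, 16)   # placeholder molecular descriptors
energies = torch.randn(200, 1)       # placeholder CCSD(T) reference energies

model = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.SiLU(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(500):              # fit the surrogate to the reference labels
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(descriptors), energies)
    loss.backward()
    opt.step()

fast_energy = model(torch.randn(1, 16))  # inference is now cheap
```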
Their “Multi-task Electronic Hamiltonian network,” or MEHnet, sheds light on a number of electronic properties, such as the dipole and quadrupole moments, electronic polarizability, and the optical excitation gap – the amount of energy needed to take an electron from the ground state to the lowest excited state. “The excitation gap affects the optical properties of materials,” Tang explains, “because it determines the frequency of light that can be absorbed by a molecule.” Another advantage of their CCSD-trained model is that it can reveal properties of not only ground states, but also excited states. The model can also predict the infrared absorption spectrum of a molecule, which relates to its vibrational properties – the vibrations of atoms within a molecule are coupled to each other, leading to various collective behaviors.
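The one-model-many-properties design can be pictured as a shared network trunk feeding a separate output head per property. The schematic below uses illustrative layer sizes and property names; it mirrors standard multi-task learning, not the actual MEHnet internals:

```python
import torch

class MultiTaskSurrogate(torch.nn.Module):
    """Schematic multi-task model: one shared trunk, one head per property."""
    def __init__(self, in_dim=16, hidden=64):
        super().__init__()
        self.trunk = torch.nn.Sequential(
            torch.nn.Linear(in_dim, hidden), torch.nn.SiLU())
        self.heads = torch.nn.ModuleDict({
            "energy": torch.nn.Linear(hidden, 1),
            "dipole": torch.nn.Linear(hidden, 3),         # a vector quantity
            "polarizability": torch.nn.Linear(hidden, 6),  # symmetric 3x3 tensor
            "optical_gap": torch.nn.Linear(hidden, 1),
        })

    def forward(self, x):
        h = self.trunk(x)                 # shared representation of the molecule
        return {name: head(h) for name, head in self.heads.items()}

props = MultiTaskSurrogate()(torch.randn(1, 16))
print({k: tuple(v.shape) for k, v in props.items()})
```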
The strength of their approach owes a lot to the network architecture. Building on the work of MIT Assistant Professor Tess Smidt, the team is utilizing a so-called E(3)-equivariant graph neural network, says Tang, “in which the nodes represent atoms and the edges that connect the nodes represent the bonds between atoms. We also use customized algorithms that incorporate physics principles – related to how people calculate molecular properties in quantum mechanics – directly into our model.”
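To make the graph picture concrete, here is a minimal sketch of message passing on a molecular graph – atoms as nodes, nearby atoms joined by edges. For simplicity it uses only interatomic distances, which makes the output E(3)-invariant; this is a simpler cousin of the fully equivariant tensor features used in e3nn-style networks like the team's:

```python
import torch

# Minimal sketch (not the authors' MEHnet): one message-passing step on a
# molecular graph where nodes are atoms and edges connect nearby atoms.
def build_edges(pos: torch.Tensor, cutoff: float = 1.8):
    """Connect atom pairs closer than `cutoff` (in angstroms)."""
    dist = torch.cdist(pos, pos)                       # (N, N) pairwise distances
    src, dst = torch.where((dist < cutoff) & (dist > 0))
    return src, dst, dist[src, dst]

class InvariantMessagePassing(torch.nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.embed = torch.nn.Embedding(100, hidden)   # one vector per element Z
        self.edge_mlp = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, hidden))
        self.update = torch.nn.Linear(2 * hidden, hidden)

    def forward(self, z, pos):
        h = self.embed(z)                              # node features from atomic numbers
        src, dst, d = build_edges(pos)
        msg = h[src] * self.edge_mlp(d.unsqueeze(-1))  # distance-gated messages
        agg = torch.zeros_like(h).index_add_(0, dst, msg)
        return self.update(torch.cat([h, agg], dim=-1))

# Water molecule: O at the origin, two H atoms.
z = torch.tensor([8, 1, 1])
pos = torch.tensor([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
features = InvariantMessagePassing()(z, pos)           # (3, 32) per-atom features
```

A fully equivariant network would additionally carry vector- and tensor-valued features that rotate along with the molecule, which is what the e3nn line of work provides.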
Testing, 1, 2, 3
When tested in analyses of known hydrocarbon molecules, the model of Li and his colleagues outperformed its DFT counterparts and closely matched experimental results taken from the published literature.
Qiang Zhu – a materials discovery specialist at the University of North Carolina at Charlotte, who was not part of this study – is impressed by what has been achieved so far. “Their method enables effective training with a small dataset, while achieving superior accuracy and computational efficiency compared to existing models,” he says. “This is exciting work that illustrates the powerful synergy between computational chemistry and deep learning, offering fresh ideas for developing more accurate and scalable electronic structure methods.”
The MIT-based group applied their model first to small, nonmetallic elements – hydrogen, carbon, nitrogen, oxygen, and fluorine, from which organic compounds can be made – and has since moved on to examining heavier elements: silicon, phosphorus, sulfur, chlorine, and even platinum. After being trained on small molecules, the model can be generalized to bigger and bigger molecules. “Previously, most calculations were limited to analyzing hundreds of atoms with DFT and just tens of atoms with CCSD(T) calculations,” Li says. “Now we're talking about handling thousands of atoms and, eventually, perhaps tens of thousands.”
For now, the researchers are still evaluating known molecules, but the model can also be used to characterize molecules that haven't been seen before, as well as to predict the properties of hypothetical materials made up of different kinds of molecules. “The idea is to use our theoretical tools to pick out promising candidates that satisfy a particular set of criteria before suggesting them to experimentalists to check,” Tang says.
It's about applications
Looking ahead, Zhu is optimistic about possible applications. “This approach holds potential for high-throughput molecular screening,” he says. “That's a task where achieving chemical accuracy can be essential for identifying novel molecules and materials with desirable properties.”
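As a toy illustration of such a screening filter – the candidate names, target property, and threshold below are all hypothetical – a fast surrogate lets one triage candidates before any experiment:

```python
# Toy high-throughput screening loop with a fast surrogate model
# (candidate list, property name, and threshold are all hypothetical).
def screen(candidates, predict, min_gap_ev=2.0):
    """Keep molecules whose predicted optical gap meets the target."""
    return [m for m in candidates if predict(m)["optical_gap_ev"] >= min_gap_ev]

# `predict` stands in for a trained model returning a dict of properties.
fake_predict = lambda m: {"optical_gap_ev": 1.5 if m == "molecule_A" else 2.7}
print(screen(["molecule_A", "molecule_B"], fake_predict))  # ['molecule_B']
```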
Once they demonstrate the ability to analyze large molecules with perhaps tens of thousands of atoms, Li says, “we should be able to invent new polymers or materials” that might be used in drug design or in semiconductor devices. The examination of heavier transition metal elements could lead to the advent of new materials for batteries – presently an area of acute need.
The future, as Li sees it, is wide open. “It's no longer about just one area,” he says. “Our ambition, ultimately, is to cover the whole periodic table with CCSD(T)-level accuracy, but at lower computational cost than DFT. This should enable us to solve a wide range of problems in chemistry, biology, and materials science. It's hard to know, at present, just how broad that range might be.”
This work was supported by the Honda Research Institute. Hao Tang acknowledges support from the MathWorks Engineering Fellowship. The calculations in this work were performed, in part, on the Matlantis fast universal atomistic simulator, the Texas Advanced Computing Center, the MIT SuperCloud, and the National Energy Research Scientific Computing Center.