Scientists from MIT and the Technion, the Israel Institute of Technology, have developed an innovative algorithm that could change the way machines are trained to handle uncertain situations in the real world. Inspired by the way people learn, the algorithm dynamically determines when the machine should imitate a “teacher” (known as imitation learning) and when it should explore and learn through trial and error (known as reinforcement learning).
The key idea of the algorithm is to strike a balance between the two learning methods. Instead of relying on brute-force trial and error or on a fixed combination of imitation and reinforcement learning, the scientists trained two student machines at the same time. One student used a weighted combination of both learning methods, while the second student relied only on reinforcement learning.
The algorithm continually compares the performance of the two students. If the student using the teacher's guidance achieves better results, the algorithm increases the weight placed on imitation learning during training. Conversely, if the student relying on trial and error shows promising progress, the algorithm shifts its focus toward reinforcement learning. By dynamically adjusting the learning approach based on performance, the algorithm proved adaptive and more effective at teaching complex tasks.
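The comparison described above can be illustrated with a minimal sketch. The article does not give the exact update rule, so everything here is an assumption made for illustration: the `BalanceController` class, its field names, the linear update, and the clipping bounds are hypothetical and are not taken from the researchers' implementation.

```python
from dataclasses import dataclass


@dataclass
class BalanceController:
    """Hypothetical helper that tunes how much the mixed student imitates the teacher."""

    weight: float = 1.0       # importance of imitation in the mixed student's objective
    step_size: float = 0.1    # assumed step size for adjusting the weight
    min_weight: float = 0.0   # assumed lower bound (0 means pure reinforcement learning)
    max_weight: float = 10.0  # assumed upper bound

    def mixed_objective(self, task_reward: float, imitation_loss: float) -> float:
        # Objective of the first student: task reward minus a weighted
        # penalty for deviating from the teacher.
        return task_reward - self.weight * imitation_loss

    def update(self, mixed_student_score: float, rl_student_score: float) -> float:
        # A positive gap means the teacher-guided student is ahead, so imitation
        # gets more weight; a negative gap shifts focus toward trial and error.
        gap = mixed_student_score - rl_student_score
        self.weight += self.step_size * gap
        # Keep the weight inside the assumed bounds.
        self.weight = min(self.max_weight, max(self.min_weight, self.weight))
        return self.weight


controller = BalanceController(weight=1.0, step_size=0.5)
controller.update(mixed_student_score=0.8, rl_student_score=0.6)  # weight goes up
controller.update(mixed_student_score=0.2, rl_student_score=0.9)  # weight goes down
```

In a real training loop, the two scores would come from periodically evaluating both students on the task, and the resulting weight would be fed back into the mixed student's objective for the next round of training.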
In simulated experiments, the scientists tested their approach by training machines to navigate mazes and manipulate objects. The algorithm achieved near-perfect success rates and outperformed methods that used only imitation learning or only reinforcement learning. The results were promising and showed the algorithm's potential for training machines to handle uncertain real-world scenarios, such as robot navigation in unfamiliar environments.
Pulkit Agrawal, director of the Improbable AI Lab and an assistant professor in the Computer Science and Artificial Intelligence Laboratory, emphasized the algorithm's ability to solve difficult tasks that previous methods struggled with. The scientists believe this approach could lead to better robots capable of complex object manipulation and locomotion.
In addition, the algorithm's applications go beyond robotics. It could improve performance in various fields that use imitation or reinforcement learning. For example, it could be used to train smaller language models by drawing on the knowledge of larger models for specific tasks. The scientists are also interested in studying the similarities and differences between how machines and humans learn from their teachers, with the aim of improving the overall educational experience.
Experts not involved in the research expressed enthusiasm for the robustness of the algorithm and its promising results across various domains. They highlighted its potential for use in areas involving memory, reasoning and tactile sensing. The algorithm's ability to leverage prior computational work and simplify the balancing of learning objectives makes it an exciting advance in the field of reinforcement learning.
As the research continues, this algorithm can pave the way for more efficient and flexible machine learning systems, bringing us closer to the development of advanced AI technologies.
Learn more about the research in the paper.