Advancing adaptive AI agents, empowering 3D scene creation, and innovating LLM training for a smarter, safer future
Next week, AI researchers from around the world will gather for the 38th Annual Conference on Neural Information Processing Systems (NeurIPS), taking place December 10-15 in Vancouver.
Two papers led by Google DeepMind researchers will be recognized with Test of Time awards for their “undeniable influence” on the field. Ilya Sutskever will present Sequence to Sequence Learning with Neural Networks, which he co-authored with Google DeepMind VP of Drastic Research Oriol Vinyals and Distinguished Scientist Quoc V. Le. Google DeepMind scientists Ian Goodfellow and David Warde-Farley will present Generative Adversarial Nets.
We will also show how we translate our foundational research into real-world applications, with live demonstrations including Gemma Scope, AI for music generation, weather forecasting and more.
Teams across Google DeepMind will present more than 100 new papers on topics ranging from AI agents and generative media to innovative learning approaches.
Building adaptive, intelligent and safe AI agents
LLM-based AI agents show promise in carrying out digital tasks from natural language commands. However, their success depends on precise interaction with complex user interfaces, which requires extensive training data. With AndroidControl, we share the most diverse control dataset to date, with over 15,000 human-collected demonstrations across more than 800 apps. AI agents trained on this dataset showed significant performance gains, which we hope will help advance research into more general AI agents.
For AI agents to generalize across tasks, they need to learn from each experience they encounter. We present a method for in-context abstraction learning (ICAL) that helps agents grasp key task patterns and relationships from imperfect demonstrations and natural language feedback, improving their performance and adaptability.
A frame from a video demonstration of someone making a sauce, with individual elements identified and numbered. ICAL is able to extract the important aspects of the process.
Developing agentic AI that works to fulfill users' goals can help make the technology more useful, but alignment is critical when developing AI that acts on our behalf. To that end, we propose a theoretical method to measure an AI system's goal-directedness, and also show how a model's perception of its user can influence its safety filters. Together, these insights underscore the importance of robust safeguards to prevent unintended or unsafe behavior, ensuring that AI agents' actions remain aligned with safe, intended uses.
Advancing 3D scene creation and simulation
As demand for high-quality 3D content grows across industries such as gaming and visual effects, creating realistic 3D scenes remains costly and time-consuming. Our recent work introduces new approaches to 3D generation, simulation and control, streamlining content creation for faster, more flexible workflows.
Producing high-quality, realistic 3D assets and scenes often requires capturing and modeling thousands of 2D photos. We present CAT3D, a system that can create 3D content in as little as a minute from any number of images, even a single image or a text prompt. CAT3D achieves this with a multi-view diffusion model that generates additional consistent 2D images from many different viewpoints, then uses those generated images as input to traditional 3D modeling techniques. The results surpass previous methods in both speed and quality.
CAT3D allows you to create 3D scenes from any number of generated or real images.
From left to right: text-to-image-to-3D, a real photo to 3D, several photos to 3D.
Simulating scenes with many rigid objects, such as a cluttered tabletop or tumbling LEGO bricks, also remains computationally intensive. To overcome this obstacle, we present a new technique called SDF-Sim that represents object shapes in a scalable way, speeding up collision detection and enabling efficient simulation of large, complex scenes.
A complex simulation of shoes falling and colliding, accurately modeled using SDF-Sim.
AI image generators based on diffusion models struggle to control the 3D position and orientation of multiple objects. Our solution, Neural Assets, introduces object-specific representations that capture both appearance and 3D pose, learned by training on dynamic video data. Neural Assets lets users move, rotate or swap objects across scenes, a useful tool for animation, gaming and virtual reality.
Given a source image and 3D bounding boxes for its objects, we can translate, rotate and rescale an object, or transfer objects or backgrounds between images.
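To give a feel for why signed distance functions (SDFs) make collision checks cheap, here is a minimal sketch. It is not the technique presented above: it uses hand-written analytic SDFs for a sphere and an axis-aligned box rather than learned shape representations, and the names `sphere_sdf`, `box_sdf` and `penetration_depth` are illustrative assumptions. An SDF returns the distance from a point to a surface, negative inside the shape, so checking whether a point on one object has penetrated another is a single function evaluation.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance from point to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius

def box_sdf(point, center, half_extents):
    """Signed distance from point to an axis-aligned box."""
    # Per-axis distance beyond the box faces (negative when inside on that axis).
    q = [abs(p - c) - h for p, c, h in zip(point, center, half_extents)]
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    inside = min(max(q), 0.0)
    return outside + inside

def penetration_depth(surface_points, sdf):
    """Deepest penetration of any sampled point into the shape described by sdf.

    Returns 0.0 when no sampled point lies inside the shape.
    """
    deepest = min(sdf(p) for p in surface_points)
    return max(-deepest, 0.0)

# Hypothetical scene: a box with half-extent 1 at the origin, and a few
# points sampled from the surface of some other object.
box = lambda p: box_sdf(p, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
points = [(2.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 3.0, 0.0)]
depth = penetration_depth(points, box)  # 0.5: the second point is inside the box
```

A positive `depth` signals a collision. A simulator would typically also use the SDF's gradient to decide which way to push the objects apart; that step is omitted here.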
Improving how LLMs learn and respond
We are also advancing how LLMs train, learn and respond to users, improving performance and efficiency on several fronts.
With larger context windows, LLMs can now learn from potentially thousands of examples at once, an approach known as many-shot in-context learning (ICL). This boosts model performance on tasks such as math, translation and reasoning, but it often requires high-quality, human-generated data. To make training more cost-effective, we explore methods for adapting many-shot ICL that reduce reliance on manually curated data.
So much data is available for training language models that the main constraint for the teams building them is the available compute. We address an important question: with a fixed compute budget, how do you choose the right model size to achieve the best results?
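As a rough illustration of the in-context learning setup described above, where many worked examples are placed in the context ahead of the actual question, the sketch below packs as many examples as fit into a token budget. The prompt format, the helper name `build_many_shot_prompt` and the whitespace-based token estimate are all assumptions made for illustration, not details of the work itself. A real system would count tokens with the model's own tokenizer.

```python
def build_many_shot_prompt(examples, query, max_tokens):
    """Pack (question, answer) examples ahead of the query until the budget is hit.

    Returns the prompt text and the number of examples that fit.
    """
    def estimate_tokens(text):
        # Crude stand-in for a tokenizer: one token per whitespace-separated word.
        return len(text.split())

    tail = f"Q: {query}\nA:"
    used = estimate_tokens(tail)
    shots = []
    for question, answer in examples:
        shot = f"Q: {question}\nA: {answer}\n"
        cost = estimate_tokens(shot)
        if used + cost > max_tokens:
            break  # The next example would overflow the context budget.
        shots.append(shot)
        used += cost
    return "\n".join(shots + [tail]), len(shots)

# Hypothetical pool of 1,000 arithmetic examples and a small context budget.
examples = [(f"What is {i} + {i}?", str(2 * i)) for i in range(1, 1001)]
prompt, num_shots = build_many_shot_prompt(examples, "What is 7 + 8?", max_tokens=200)
```

Raising `max_tokens` is what moves this from a few-shot prompt to a many-shot one. The sketch does not address where the examples come from.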
Another innovative approach, which we call Time-Reversed Language Models (TRLM), explores pretraining and finetuning an LLM to work in reverse. Given traditional LLM responses as input, a TRLM generates queries that might have produced those responses. Paired with a traditional LLM, this method not only helps ensure that responses follow user instructions better, but also improves the generation of citations for summarized text and strengthens safety filters against harmful content.
Curating high-quality data is vital for training large AI models, but manual curation is difficult at scale. To address this, our Joint Example Selection (JEST) algorithm optimizes training by identifying the most learnable data within larger batches, enabling up to 13× fewer training rounds and 10× less computation while outperforming state-of-the-art multimodal pretraining baselines.
Planning tasks are another challenge for AI, especially in stochastic environments, where outcomes are influenced by randomness or uncertainty. Researchers use various types of inference for planning, but there is no consistent approach. We show that planning itself can be viewed as a distinct type of probabilistic inference, and propose a framework for ranking different inference techniques by their planning effectiveness.
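One way to picture selecting the "most learnable" data is to score every example in a large candidate batch and train only on the top slice. The sketch below is a simplification under stated assumptions: it scores examples independently, using the gap between the current learner's loss and a reference model's loss as the learnability score, whereas the example-selection algorithm described above chooses examples jointly. The function name `select_learnable` and all loss values are hypothetical.

```python
def select_learnable(learner_losses, reference_losses, keep):
    """Return the indices of the `keep` examples with the highest learnability.

    Score = learner loss - reference loss (an assumed scoring rule): it is high
    when the learner still struggles with an example that a reference model
    finds easy, which suggests the example is learnable rather than just noisy.
    """
    scores = [l - r for l, r in zip(learner_losses, reference_losses)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])

# Hypothetical per-example losses for a candidate batch of six examples.
learner_losses = [2.0, 0.3, 1.8, 2.5, 0.9, 1.2]
reference_losses = [0.4, 0.2, 1.7, 2.4, 0.1, 1.0]
selected = select_learnable(learner_losses, reference_losses, keep=2)  # [0, 4]
```

Examples 2 and 3 have high learner loss but are skipped, because the reference model finds them hard too. Filtering out such data is the point of comparing against a reference.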
Connecting the global AI community
We are proud to be a Diamond Sponsor of the conference, and to support Women in Machine Learning, LatinX in AI and Black in AI in building communities around the world working in AI, machine learning and data science.
If you are at NeurIPS this year, stop by the Google DeepMind and Google Research booths to explore cutting-edge research in demos, workshops and more throughout the conference.