Improving how LLMs learn and respond
We are also improving how LLMs train, learn, and respond to users, making them more efficient and effective on several fronts.
Thanks to larger context windows, LLMs can now learn from potentially thousands of examples at once, an approach called many-shot in-context learning (ICL). This improves model performance on tasks such as mathematics, translation, and reasoning, but often requires high-quality human-generated data. We are exploring more cost-effective ways to adapt many-shot ICL that reduce its dependence on manually curated data. With so much data available for training language models, the main constraint for teams building them becomes available compute. This raises an important question: given a fixed compute budget, how should one choose the model size to achieve the best results?
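The mechanics of many-shot ICL can be sketched in a few lines: rather than a handful of demonstrations, the prompt packs in hundreds or thousands of examples for a long-context model to attend to at inference time. The example pool and prompt format below are illustrative placeholders, not from the original work.

```python
# Minimal sketch of many-shot in-context learning (ICL): the prompt contains
# every demonstration plus the final query, which only becomes feasible with
# large context windows.

def build_many_shot_prompt(examples, query, instruction="Translate English to French."):
    """Assemble one prompt containing all demonstrations followed by the query."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"

# With ~1000 demonstrations the prompt easily runs to many thousands of tokens,
# which is why long context windows are the enabling ingredient here.
pool = [(f"sentence {i}", f"phrase {i}") for i in range(1000)]
prompt = build_many_shot_prompt(pool, "sentence 42")
```

The resulting string would then be sent as a single request to a long-context model; the model infers the task from the demonstrations alone, with no weight updates.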
Another innovative approach, which we call time-reversed language models (TRLMs), explores pretraining and fine-tuning an LLM to operate in reverse. Given a traditional LLM's response as input, a TRLM generates the queries that might have produced that response. Combined with a traditional LLM, this method not only helps ensure that responses better follow user instructions, but also improves citation generation for summarized text and strengthens filters against harmful content.
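The reverse-scoring idea can be illustrated with a toy: a reverse model rates how plausibly each candidate query would have produced a given response, which can then drive reranking or attribution. Here `reverse_score` is a crude word-overlap stand-in for a real reverse model's log-likelihood; all names and data below are illustrative assumptions, not the paper's implementation.

```python
# Toy sketch of the TRLM idea: score candidate queries by how well they
# "explain" a response, i.e. a stand-in for log P_reverse(query | response).

def reverse_score(query: str, response: str) -> float:
    """Stand-in for a reverse model's likelihood: crude lexical overlap."""
    q, r = set(query.lower().split()), set(response.lower().split())
    return len(q & r) / max(len(q), 1)

def best_source_query(response: str, candidates: list[str]) -> str:
    """Pick the candidate query the reverse model deems most likely."""
    return max(candidates, key=lambda q: reverse_score(q, response))

response = "Paris is the capital of France."
candidates = [
    "What is the capital of France?",
    "How tall is the Eiffel Tower?",
]
print(best_source_query(response, candidates))  # "What is the capital of France?"
```

A real TRLM would replace the overlap heuristic with an actual model trained in the reverse (response-to-query) direction, but the reranking logic is the same.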
Curating high-quality data is essential for training large AI models, but manual curation is difficult at scale. To address this, our joint example selection method (JEST) optimizes training by identifying the most learnable data within larger batches, enabling up to 13× fewer training iterations and 10× less computation while outperforming state-of-the-art multimodal pretraining baselines.
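The selection criterion can be sketched with a learnability score: an example is worth training on when the current learner finds it harder than a pretrained reference model does (learner loss minus reference loss). Note that JEST proper scores examples *jointly* within a batch; the independent top-k version below is a deliberate simplification for illustration.

```python
# Minimal sketch of learnability-based data selection in the spirit of JEST:
# keep the sub-batch of examples with the largest gap between the learner's
# loss and a reference model's loss, and take the gradient step on those only.

def select_learnable(learner_losses, reference_losses, k):
    """Return indices of the k examples with the highest learnability score."""
    scores = [l - r for l, r in zip(learner_losses, reference_losses)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

learner = [2.5, 0.4, 3.0, 1.1]    # current model's per-example loss
reference = [2.4, 0.3, 1.0, 1.2]  # reference model's per-example loss
print(select_learnable(learner, reference, 2))  # [0, 2]
```

Example 2 is highly learnable (the reference finds it easy, the learner does not), while example 3 is skipped (the learner already beats the reference on it).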
Planning tasks present another challenge for AI, especially in stochastic environments where outcomes are shaped by randomness or uncertainty. Researchers apply different types of inference to planning, but no consistent approach has emerged. We show that planning itself can be viewed as a distinct type of probabilistic inference and propose a framework for ranking inference techniques by their planning performance.
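To make "planning as inference" concrete, here is a toy version: in a stochastic environment, the question "which action should I take?" is recast as posterior inference, P(action | success), estimated below by simple rejection sampling over rollouts. The two-action environment and its success probabilities are invented for illustration; the framework in the work above compares far more sophisticated inference techniques.

```python
import random

# Toy sketch of planning as probabilistic inference: sample actions from a
# uniform prior, simulate the stochastic outcome, condition on observing
# success, and pick the action with the highest posterior probability.

def step(action, rng):
    """Stochastic outcome: action 'a' succeeds 70% of the time, 'b' 40%."""
    p = {"a": 0.7, "b": 0.4}[action]
    return rng.random() < p

def plan_by_inference(n_samples=10_000, seed=0):
    """Estimate P(action | success) by rejection sampling; return the argmax."""
    rng = random.Random(seed)
    counts = {"a": 0, "b": 0}
    for _ in range(n_samples):
        action = rng.choice(["a", "b"])  # uniform prior over actions
        if step(action, rng):            # keep only samples where we succeed
            counts[action] += 1
    return max(counts, key=counts.get)

print(plan_by_inference())  # 'a'
```

Different inference techniques (exact enumeration, variational methods, sequential Monte Carlo) could replace the rejection sampler here, which is exactly the kind of comparison a ranking framework makes possible.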
















