We're expanding our risk domains and strengthening our risk assessment process.
AI breakthroughs are transforming our everyday lives, from advancing mathematics, biology and astronomy to realizing the potential of personalized education. As we build increasingly powerful AI models, we're committed to developing our technologies responsibly and to taking an evidence-based approach to staying ahead of emerging risks.
Today, we're publishing the third iteration of our Frontier Safety Framework (FSF) – our most comprehensive approach yet to identifying and mitigating severe risks from advanced AI models.
This update builds on our ongoing collaboration with experts across industry, academia and government. We've also incorporated lessons learned from implementing previous versions, along with evolving best practices in frontier AI safety.
Key framework updates
Addressing the risks of harmful manipulation
With this update, we're introducing a Critical Capability Level (CCL)* focused on harmful manipulation – specifically, AI models with powerful manipulative capabilities that could be misused to systematically and substantially change beliefs and behaviors in identified high-stakes contexts over the course of interactions, resulting in harm at severe scale.
This addition builds on research we've conducted to identify and evaluate the mechanisms that drive manipulation in generative AI. Going forward, we will continue to invest in this domain to better understand and measure the risks associated with harmful manipulation.
Adapting our approach to misalignment risks
We've also expanded our Framework to address potential future scenarios in which misaligned AI models might interfere with operators' ability to direct, modify or shut down their operations.
While the previous version of the Framework included an exploratory approach centered on instrumental-reasoning CCLs (i.e., warning levels specific to when an AI model begins to reason deceptively), with this update we now provide further protocols for our machine learning research and development CCLs, which focus on models that could accelerate AI research and development to potentially destabilizing levels.
In addition to the misuse risks arising from these capabilities, there are also misalignment risks stemming from a model's potential for undirected action at these capability levels, and from the likely integration of such models into AI development and deployment processes.
To address the risks posed by these CCLs, we conduct safety case reviews before external launches when the relevant CCL has been reached. This involves performing detailed analyses demonstrating how risks have been reduced to manageable levels. Because large-scale internal deployments of models at the advanced machine learning research and development CCLs can also pose risk, we are expanding this approach to cover such deployments as well.
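To make the gating logic concrete, here is a minimal illustrative sketch of how capability evaluations might gate a deployment decision. This is not DeepMind's actual tooling or published specification; the names (`EvalResult`, `safety_case_approved`, the numeric thresholds) are hypothetical stand-ins for the process described above.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Deployment(Enum):
    EXTERNAL_LAUNCH = auto()
    LARGE_SCALE_INTERNAL = auto()  # now also gated for ML R&D CCLs

@dataclass
class EvalResult:
    domain: str           # e.g. "harmful_manipulation", "ml_rnd"
    score: float          # capability score from early-warning evaluations
    ccl_threshold: float  # hypothetical score marking the CCL for this domain

def reached_ccls(results: list[EvalResult]) -> list[str]:
    """Return the risk domains whose Critical Capability Level is reached."""
    return [r.domain for r in results if r.score >= r.ccl_threshold]

def may_deploy(results: list[EvalResult],
               deployment: Deployment,
               safety_case_approved: bool) -> bool:
    """A deployment proceeds only if no CCL is reached, or a safety case
    review has demonstrated risks are reduced to manageable levels."""
    if not reached_ccls(results):
        return True
    return safety_case_approved

# Example: a model crossing the hypothetical ML R&D CCL requires an approved
# safety case even for a large-scale internal deployment.
evals = [EvalResult("ml_rnd", score=0.9, ccl_threshold=0.8)]
print(may_deploy(evals, Deployment.LARGE_SCALE_INTERNAL,
                 safety_case_approved=False))  # -> False
```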
Sharpening our risk assessment process
Our Framework is designed to address risks in proportion to their severity. We've sharpened our CCL definitions to specifically identify the critical threats that warrant the most rigorous governance and mitigation strategies. We continue to apply safety and security mitigations before specific CCL thresholds are reached, as part of our standard approach to model development.
Finally, this update describes our risk assessment process in greater detail. Building on our core early-warning evaluations, we describe how we conduct holistic assessments that include systematic risk identification, comprehensive analyses of model capabilities and explicit determinations of risk acceptability.
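As an illustration only, the three stages named above could be modeled as an explicit pipeline. The stage names are paraphrased from this post; the structure and all function names below are our own assumptions, not a published specification.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for real evaluation machinery.
def identify_risks(model_id: str) -> list[str]:
    return ["harmful_manipulation", "ml_rnd"]  # systematic risk identification

def analyze_capability(model_id: str, risk: str) -> str:
    return f"mitigated: capability report for {risk}"  # capability analysis

def determine_acceptability(findings: dict[str, str]) -> bool:
    return all(report.startswith("mitigated") for report in findings.values())

@dataclass
class HolisticAssessment:
    identified_risks: list[str]          # stage 1: systematic risk identification
    capability_findings: dict[str, str]  # stage 2: comprehensive capability analyses
    risk_acceptable: bool                # stage 3: explicit acceptability determination

def run_holistic_assessment(model_id: str) -> HolisticAssessment:
    risks = identify_risks(model_id)
    findings = {risk: analyze_capability(model_id, risk) for risk in risks}
    return HolisticAssessment(risks, findings, determine_acceptability(findings))

print(run_holistic_assessment("frontier-model").risk_acceptable)  # -> True
```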
Advancing our commitment to frontier safety
This latest update to our Frontier Safety Framework represents our continued commitment to a scientific, evidence-based approach to tracking and staying ahead of AI risks as capabilities advance toward AGI. By expanding our risk domains and strengthening our risk assessment processes, we aim to ensure that transformative AI benefits humanity while minimizing potential harms.
Our Framework will continue to evolve based on new research, input from stakeholders and lessons from implementation. We remain committed to working collaboratively across industry, academia and government.
The path to beneficial AGI requires not only technical breakthroughs, but also robust frameworks that mitigate risks along the way. We hope our updated Frontier Safety Framework contributes meaningfully to this collective effort.


















