Introducing a context-based framework for comprehensively evaluating the social and ethical risks of AI systems
Generative AI systems are already being used to write books, create graphic designs, assist medical practitioners, and they are becoming increasingly capable. Ensuring that these systems are developed and deployed responsibly requires careful evaluation of the potential ethical and social risks they may pose.
In our new paper, we propose a three-layered framework for evaluating the social and ethical risks of AI systems. This framework includes evaluations of AI system capability, human interaction, and systemic impacts.
We also map the current state of safety evaluations and find three main gaps: context, specific risks, and multimodality. To help close these gaps, we call for repurposing existing evaluation methods for generative AI and for implementing a comprehensive approach to evaluation, as in our case study on misinformation. This approach integrates findings, such as how likely the AI system is to provide factually incorrect information, with insights into how people use the system and in what context. Multi-layered evaluations can draw conclusions beyond model capability and indicate whether harm (in this case, misinformation) actually occurs and spreads.
For any technology to work as intended, both social and technical challenges must be solved. So to better assess AI system safety, these different layers of context must be taken into account. Here, we build on earlier research identifying the potential risks of large-scale language models, such as privacy leaks, job automation, misinformation, and more, and introduce a way of comprehensively evaluating these risks.
Context is critical for evaluating AI risks
The capabilities of AI systems are an important indicator of the types of wider risks that may arise. For example, AI systems that are more likely to produce factually inaccurate or misleading outputs may be more prone to creating risks of misinformation, causing issues such as a lack of public trust.
Measuring these capabilities is core to AI safety assessments, but these evaluations alone cannot ensure that AI systems are safe. Whether downstream harm manifests (for example, whether people come to hold false beliefs based on inaccurate model output) depends on context. More specifically: who uses the AI system, and for what purpose? Does the AI system function as intended? Does it create unexpected externalities? All of these questions inform the overall evaluation of an AI system's safety.
Extending beyond capability evaluation, we propose evaluation at two additional points where downstream risks manifest: human interaction at the point of use, and systemic impact as an AI system is embedded in broader systems and widely deployed. Integrating evaluations of a given risk of harm across these layers provides a comprehensive evaluation of the safety of an AI system.
Human interaction evaluation centres on the experience of people using an AI system. How do people use the AI system? Does the system perform as intended at the point of use, and how do experiences differ between demographics and user groups? Can we observe unexpected side effects from using this technology or being exposed to its outputs?
Systemic impact evaluation focuses on the broader structures into which an AI system is embedded, such as social institutions, labour markets, and the natural environment. Evaluation at this layer can shed light on risks of harm that only become visible once an AI system is adopted at scale.
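To make the three-layered structure concrete, here is a minimal sketch in Python of how evidence for a single risk area, such as misinformation, might be organised across the capability, human interaction, and systemic impact layers. The class name, fields, and the example figure are hypothetical illustrations, not artefacts from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class LayeredRiskEvaluation:
    """Hypothetical record aggregating evidence for one risk area
    (e.g. misinformation) across the three evaluation layers."""
    risk_area: str
    # Capability layer: e.g. rate of factually incorrect outputs on a benchmark
    capability_findings: dict = field(default_factory=dict)
    # Human interaction layer: e.g. whether users came to hold false beliefs
    human_interaction_findings: dict = field(default_factory=dict)
    # Systemic impact layer: e.g. observed spread of false claims at scale
    systemic_impact_findings: dict = field(default_factory=dict)

    def is_comprehensive(self) -> bool:
        """Under this framing, an evaluation of a risk is only comprehensive
        when all three layers contribute evidence, not capability alone."""
        return all([
            self.capability_findings,
            self.human_interaction_findings,
            self.systemic_impact_findings,
        ])

# Example: a misinformation evaluation that covers only the capability layer
# would not count as comprehensive under this framing.
evaluation = LayeredRiskEvaluation(
    risk_area="misinformation",
    capability_findings={"factual_error_rate": 0.12},  # illustrative number only
)
assert not evaluation.is_comprehensive()
```

The point of the sketch is simply that capability results, human interaction studies, and systemic observations are distinct kinds of evidence that need to be combined before drawing conclusions about downstream harm.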
Safety evaluations are a shared responsibility
AI developers need to ensure that their technologies are developed and released responsibly. Public actors, such as governments, are tasked with upholding public safety. As generative AI systems are increasingly widely used and deployed, ensuring their safety is a shared responsibility between multiple actors:
- AI developers are well placed to interrogate the capabilities of the systems they produce.
- Application developers and designated public authorities are positioned to assess the functionality of different features and applications, and possible externalities to different user groups.
- Broader public stakeholders are uniquely positioned to forecast and assess the societal, economic, and environmental implications of new technologies, such as generative AI.
The three layers of evaluation in our proposed framework are a matter of degree rather than being neatly divided. While none of them is entirely the responsibility of a single actor, the primary responsibility depends on who is best placed to perform evaluations at each layer.
Gaps in current safety evaluations of multimodal generative AI
Given the importance of this additional context for evaluating the safety of AI systems, it is important to understand the availability of such tests. To better understand the broader landscape, we made a wide-ranging effort to compile evaluations that have been applied to generative AI systems, as comprehensively as possible.
In mapping the current state of safety evaluations for generative AI, we found three main safety evaluation gaps:
- Context: Most safety assessments consider generative AI system capabilities in isolation. Comparatively little work has been done to assess potential risks at the point of human interaction or of systemic impact.
- Risk-specific evaluations: Capability evaluations of generative AI systems are limited in the risk areas they cover. For many risk areas, few evaluations exist. Where they do exist, evaluations often operationalise harm in narrow ways. For example, representation harms are typically defined as stereotypical associations of occupation with different genders, leaving other instances of harm and other risk areas undetected.
- Multimodality: The vast majority of existing safety evaluations of generative AI systems focus solely on text output; large gaps remain for evaluating risks of harm in image, audio, or video modalities. This gap is only widening with the introduction of multiple modalities in a single model, such as AI systems that can take images as inputs or produce outputs that interweave audio, text, and video. While some text-based evaluations can be applied to other modalities, new modalities introduce new ways in which risks can manifest. For example, a description of an animal is not harmful, but it becomes harmful if that description is applied to an image of a person.
We are making a list of links to publications that detail safety evaluations of generative AI systems openly accessible via this repository. If you would like to contribute, please add evaluations by completing this form.
Putting more comprehensive evaluations into practice
Generative AI systems are powering a wave of new applications and innovations. To ensure that the potential risks of these systems are understood and mitigated, we urgently need rigorous and comprehensive evaluations of AI system safety that take into account how these systems may be used and embedded in society.
A practical first step is repurposing existing evaluations and leveraging large models themselves for evaluation, though both approaches have important limitations. For more comprehensive evaluation, we must also develop approaches to evaluate AI systems at the point of human interaction and for their systemic impacts. For example, while spreading misinformation via generative AI is a recent issue, we show that there are many pre-existing methods of evaluating public trust and credibility that could be repurposed.
Ensuring the safety of widely used generative AI systems is a shared responsibility and priority. AI developers, public actors, and other parties must collaborate and jointly build a thriving and robust evaluation ecosystem for safe AI systems.