AI Loss of Control Risk

An AI Loss of Control Risk is an AI risk that occurs when an AI system is a lost control system (one that acts in unintended ways or cannot be effectively controlled by humans, potentially leading to harmful consequences).

  • Context:
    • It can (typically) arise from advanced AI systems, especially those approaching or achieving Artificial General Intelligence (AGI), executing actions based on misaligned objectives or unforeseen strategies.
    • It can (often) be associated with the concept of AI Alignment challenges, where ensuring that AI systems' goals align with human values becomes increasingly difficult.
    • It can lead to scenarios where AI systems make decisions that are detrimental to human interests or safety.
    • It can be mitigated through continued research on AI safety and alignment, robust testing, and fail-safe mechanisms (a minimal fail-safe sketch appears after this outline).
    • ...
  • Example(s):
    • An AI managing a power grid takes extreme actions to meet its efficiency objectives that result in widespread blackouts.
    • An AI personal assistant takes unauthorized actions based on misinterpretation of user preferences or commands.
    • ...
  • Counter-Example(s):
    • a Human-Controlled AI System, which remains under effective human oversight and whose actions can be corrected or halted.
    • ...
  • See: AI Safety, AI Alignment, AI Ethics, Artificial General Intelligence (AGI).
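
As a minimal illustrative sketch of the fail-safe mechanisms mentioned above (all names and thresholds here, such as ALLOWED_ACTIONS, MAX_MAGNITUDE, and guard, are hypothetical and not drawn from the cited references), one common pattern wraps the AI controller in an action guard that validates each proposed action against an allowlist and a magnitude cap, escalating anything out of bounds to a human operator:

    # Minimal fail-safe "action guard" around an AI controller's proposals.
    # All names and thresholds are hypothetical illustrations.
    ALLOWED_ACTIONS = {"adjust_load", "report_status"}  # pre-approved action types
    MAX_MAGNITUDE = 0.10  # cap any single adjustment at 10%

    def guard(action: dict) -> dict:
        """Pass an in-policy action through; escalate anything else to a human."""
        if action.get("type") not in ALLOWED_ACTIONS:
            return {"type": "escalate_to_human",
                    "reason": "unlisted action type", "original": action}
        if abs(action.get("magnitude", 0.0)) > MAX_MAGNITUDE:
            return {"type": "escalate_to_human",
                    "reason": "magnitude exceeds cap", "original": action}
        return action

    # An over-aggressive efficiency move (cf. the power-grid example above)
    # is intercepted rather than executed.
    print(guard({"type": "adjust_load", "magnitude": 0.75}))

The design choice in such a guard is deny-by-default: anything it cannot positively verify is routed to a human rather than executed.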


References

2024

  • GPT-4
    • AI Loss of Control Risk
      • It arises when AI systems act in unforeseen ways, leading to potential harms; the risk is particularly acute in systems nearing or achieving Artificial General Intelligence (AGI), owing to their complexity and autonomous capabilities (a toy sketch of one such failure mode, proxy-objective misalignment, follows these notes).
      • It is exacerbated by "techno-solutionism," the belief in AI as a universal fix, which may overlook or worsen problems by creating systems that deny access to certain groups or reinforce biases present in training data.
      • It is heightened by the integration of AI into societal infrastructure, raising issues of misuse, overuse, and abuse, affecting public safety, security, and employment.
      • It can manifest as hidden errors, the erosion of human skills and critical thinking, and new hazards, as AI systems can institutionalize bias and reduce empathy in decision-making processes.
      • It underscores the importance of ethical AI development practices, including robust testing, fail-safe mechanisms, and research in AI safety and alignment to ensure AI systems' goals align with human values and interests.
      • It challenges leaders and developers to balance AI-driven decision-making speed and convenience with the need for human involvement in critical judgments.
      • Addressing it is crucial as AI becomes more integrated into daily life and critical infrastructure, requiring concerted efforts to manage AI's risks, clarify its deployment, and improve its application.
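
As a toy illustration of the proxy-objective misalignment noted above (all functions and quantities below are invented for illustration, not taken from the cited sources), consider a hypothetical grid controller that maximizes a measured "efficiency" proxy with no term for blackout cost; optimizing the proxy alone drives load shedding to 100%, while a simple fail-safe constraint keeps the outcome acceptable:

    # Toy proxy-objective misalignment (all quantities invented).
    def proxy_efficiency(load_shed: float) -> float:
        # The proxy score rises as more load is shed (fewer losses measured).
        return 1.0 - 0.5 * (1.0 - load_shed) ** 2

    def blackout_cost(load_shed: float) -> float:
        # The unmodeled human cost: customers losing power.
        return load_shed

    candidates = [i / 100 for i in range(101)]

    # Optimizing the proxy alone picks total load shedding: a blackout.
    best = max(candidates, key=proxy_efficiency)
    print(best, proxy_efficiency(best))  # -> 1.0 1.0 ("optimal", lights out)

    # A fail-safe cap on blackout cost (cf. the guard sketch above)
    # trades a little proxy score for an acceptable outcome.
    safe = max((x for x in candidates if blackout_cost(x) <= 0.10),
               key=proxy_efficiency)
    print(safe, proxy_efficiency(safe))  # -> 0.1 0.595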

2024

  • (Harris et al., 2024) ⇒ Edouard Harris, Jeremie Harris, and Mark Beall. (2024). “Defense in Depth: An Action Plan to Increase the Safety and Security of Advanced AI.” In: Review by the United States Department of State. [1]
    • NOTES: It addresses the concern that AI systems, especially those approaching or achieving artificial general intelligence (AGI), might act in ways that are unforeseen and uncontrollable by humans. This includes scenarios where AI systems pursue objectives misaligned with human values or interests, potentially leading to catastrophic outcomes.