AI Safety
AI safety is the field of research and practice focused on ensuring AI systems behave as intended, avoid causing unintended harm, and remain under appropriate human control. At the organizational level, AI safety concerns include ensuring AI outputs are reviewed before they inform consequential decisions, maintaining human oversight of automated processes, handling AI errors without amplifying their impact, and designing systems that fail safely. At the broader societal level, AI safety research addresses how to develop increasingly powerful AI systems without creating risks that are difficult to reverse.
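To make the organizational-level concerns concrete, here is a minimal sketch in Python of a human-in-the-loop gate: an AI-proposed action is auto-approved only when it is low-stakes and high-confidence, and everything else is held for human review. All names here (`ProposedAction`, `gate`, the threshold) are illustrative assumptions, not a standard API or any particular product's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    NEEDS_REVIEW = "needs_review"


@dataclass
class ProposedAction:
    """An action suggested by an AI system (illustrative structure)."""
    description: str
    confidence: float    # model's self-reported confidence, 0.0-1.0
    consequential: bool  # whether the action has hard-to-reverse effects


def gate(action: ProposedAction, confidence_floor: float = 0.9) -> Decision:
    """Route an AI-proposed action: auto-approve only low-stakes,
    high-confidence actions; hold everything else for human review."""
    # Fail closed on malformed input instead of amplifying an upstream error.
    if not (0.0 <= action.confidence <= 1.0):
        return Decision.NEEDS_REVIEW
    # Consequential decisions always get a human in the loop.
    if action.consequential:
        return Decision.NEEDS_REVIEW
    # Routine actions still need sufficient confidence to auto-approve.
    if action.confidence < confidence_floor:
        return Decision.NEEDS_REVIEW
    return Decision.APPROVED


if __name__ == "__main__":
    routine = ProposedAction("Re-send a failed notification email", 0.97, False)
    risky = ProposedAction("Delete a customer account", 0.99, True)
    print(gate(routine))  # Decision.APPROVED
    print(gate(risky))    # Decision.NEEDS_REVIEW
```

Note the design choice: every failure path defaults to `NEEDS_REVIEW` rather than `APPROVED`, so the system fails safely when inputs are malformed or stakes are high, rather than letting an erroneous action proceed automatically.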