Constitutional AI

Constitutional AI is a training approach developed by Anthropic to make AI systems more reliably helpful and harmless. Rather than relying solely on human feedback to reinforce desired behaviors, Constitutional AI gives the model a set of written principles (a "constitution") and trains it to critique and revise its own outputs against those principles. The approach is designed to produce AI that is more consistently aligned with human values while depending less on large volumes of human-labeled data.
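
The critique-and-revise step described above can be sketched as a short loop. This is a minimal illustration, not Anthropic's implementation: the names `Principle`, `critique_and_revise`, and `toy_model`, along with the prompt wording, are all hypothetical, and the "model" is a fixed stub standing in for a real language model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a language model: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Principle:
    """One entry in an illustrative constitution."""
    name: str
    text: str


def critique_and_revise(
    model: Model, prompt: str, constitution: List[Principle]
) -> List[str]:
    """Return the initial draft followed by one revision per principle."""
    response = model(prompt)
    history = [response]
    for principle in constitution:
        # Ask the model to critique its own response against one principle.
        critique = model(
            f"Critique the response against this principle: {principle.text}\n"
            f"Response: {response}"
        )
        # Ask the model to rewrite the response in light of that critique.
        response = model(
            "Revise the response to address the critique.\n"
            f"Response: {response}\nCritique: {critique}"
        )
        history.append(response)
    return history


def toy_model(prompt: str) -> str:
    """Deterministic stub so the sketch runs without a real model."""
    if prompt.startswith("Critique"):
        return "The response could be more careful."
    if prompt.startswith("Revise"):
        return "revised answer"
    return "draft answer"


if __name__ == "__main__":
    constitution = [
        Principle("harmless", "Avoid content that could cause harm."),
        Principle("helpful", "Address the user's actual question."),
    ]
    for step in critique_and_revise(toy_model, "Some question", constitution):
        print(step)
```

In a training setting, revisions like these would serve as examples for further training rather than being shown to a user directly; that step is omitted from the sketch.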

Claude is trained using Constitutional AI, which is part of why it tends to handle ethically nuanced situations with more care than models trained with other approaches.
