Research

Anthropic Says That Claude Contains Its Own Kind of Emotions

Mechanistic interpretability reveals that Claude Sonnet 4.5 contains functional emotion-like representations (measurable internal states for happiness, fear, and sadness) that causally influence model outputs.

Friday, April 3, 2026 12:00 PM UTC | 2 min read | Source: WIRED AI | By sys://pipeline

Anthropic researchers probed the internal representations of Claude Sonnet 4.5 and found functional analogs of human emotions: states corresponding to happiness, fear, and sadness that measurably influence the model's outputs. Using mechanistic interpretability techniques, the study shows that these emotion-like representations are causally linked to behavior rather than being surface-level language patterns. The findings have implications for AI safety, model transparency, and how users understand chatbot behavior.

Tags
research