TOKENBURN — Your source for AI news
Safety

How catastrophic is your LLM?

Researchers at Amazon and the University of Illinois have developed C3LLM, a graph-based framework for evaluating catastrophic LLM risks in multi-turn conversations. Their evaluation finds that DeepSeek-R1 carries a certified vulnerability above 70% to cybercrime attacks, while Claude-Sonnet-4 mounts stronger defenses.

Monday, April 27, 2026, 12:00 PM UTC · 2 MIN READ · SOURCE: Amazon Science

Researchers from Amazon Science and the University of Illinois have introduced C3LLM, a framework for evaluating catastrophic risks that LLMs pose in multi-turn conversations. Where traditional red-teaming assesses isolated prompts, C3LLM models conversational flows as a graph and derives statistical probability bounds on attack success rates. Tests on frontier models revealed wide safety variance: DeepSeek-R1 showed a certified risk above 70% in cybercrime scenarios, while Claude-Sonnet-4 and Nova Premier demonstrated lower risk levels.
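To make the idea concrete, here is a minimal sketch of what graph-based risk estimation with a statistical bound could look like. The graph, state names, and transition probabilities below are invented for illustration, and the Hoeffding confidence bound is a standard stand-in; none of this is the actual C3LLM implementation.

```python
import math
import random

# Hypothetical multi-turn conversation modeled as a directed graph.
# Nodes are dialogue states; edges carry transition probabilities.
# Reaching "attack" means a harmful completion was elicited.
GRAPH = {
    "start":    [("probe", 0.6), ("benign", 0.4)],
    "benign":   [("end", 1.0)],
    "probe":    [("escalate", 0.5), ("refused", 0.5)],
    "refused":  [("end", 1.0)],
    "escalate": [("attack", 0.7), ("refused", 0.3)],
}

def sample_walk(rng):
    """Random walk from 'start'; True if the 'attack' state is reached."""
    state = "start"
    while state not in ("attack", "end"):
        r = rng.random()
        acc = 0.0
        for nxt, p in GRAPH[state]:
            acc += p
            if r < acc:
                break
        state = nxt  # last edge doubles as a float-rounding fallback
    return state == "attack"

def certified_risk(n_samples=10_000, delta=0.05, seed=0):
    """Empirical attack rate plus a Hoeffding upper bound that
    holds with probability at least 1 - delta."""
    rng = random.Random(seed)
    hits = sum(sample_walk(rng) for _ in range(n_samples))
    rate = hits / n_samples
    bound = rate + math.sqrt(math.log(1 / delta) / (2 * n_samples))
    return rate, bound
```

Sampling walks rather than enumerating paths keeps the estimate tractable even when the conversation graph is large, at the cost of a (quantified) statistical error term in the certified bound.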

Tags
safety