TOKENBURN — Your source for AI news
Safety

How catastrophic is your LLM?

Researchers at Amazon and the University of Illinois have developed C3LLM, a graph-based framework for evaluating catastrophic LLM risks in multi-turn conversations. Their evaluation finds that DeepSeek-R1 carries a certified vulnerability above 70% to cybercrime attacks, while Claude-Sonnet-4 mounts stronger defenses.

Monday, April 27, 2026, 12:00 PM UTC · 2 MIN READ · SOURCE: Amazon Science

Researchers from Amazon Science and the University of Illinois have introduced C3LLM, a framework for evaluating catastrophic risks that LLMs pose in multi-turn conversations. Where traditional red-teaming assesses isolated prompts, C3LLM models conversational flows as a graph and derives statistical probability bounds on attack success rates. Tests on frontier models revealed wide safety variance: DeepSeek-R1 showed a certified risk above 70% in cybercrime scenarios, while Claude-Sonnet-4 and Nova Premier demonstrated lower risk levels.
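To make the idea concrete, here is a minimal sketch of what graph-based risk estimation with a statistical bound could look like. The graph, state names, and transition probabilities below are invented for illustration, and the Hoeffding confidence bound is a standard stand-in; none of this is the actual C3LLM implementation.

```python
import math
import random

# Hypothetical multi-turn conversation modeled as a directed graph.
# Nodes are dialogue states; edges carry transition probabilities.
# Reaching "attack" means a harmful completion was elicited.
GRAPH = {
    "start":    [("probe", 0.6), ("benign", 0.4)],
    "benign":   [("end", 1.0)],
    "probe":    [("escalate", 0.5), ("refused", 0.5)],
    "refused":  [("end", 1.0)],
    "escalate": [("attack", 0.7), ("refused", 0.3)],
}

def sample_walk(rng):
    """Random walk from 'start'; True if the 'attack' state is reached."""
    state = "start"
    while state not in ("attack", "end"):
        r = rng.random()
        acc = 0.0
        for nxt, p in GRAPH[state]:
            acc += p
            if r < acc:
                break
        state = nxt  # last edge doubles as a float-rounding fallback
    return state == "attack"

def certified_risk(n_samples=10_000, delta=0.05, seed=0):
    """Empirical attack rate plus a Hoeffding upper bound that
    holds with probability at least 1 - delta."""
    rng = random.Random(seed)
    hits = sum(sample_walk(rng) for _ in range(n_samples))
    rate = hits / n_samples
    bound = rate + math.sqrt(math.log(1 / delta) / (2 * n_samples))
    return rate, bound
```

Sampling walks rather than enumerating paths keeps the estimate tractable even when the conversation graph is large, at the cost of a (quantified) statistical error term in the certified bound.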

Tags
safety