Claude Sonnet 4.6
6 mentions across all digests
Claude Sonnet 4.6 is an Anthropic model with a 1-million-token context window. It set new SWE-Bench and OSWorld benchmark records and serves as the default model for the Free and Pro tiers, with strong coding, instruction-following, and agentic task performance.
Last Week in AI #336 - Sonnet 4.6, Gemini 3.1 Pro, Anthropic vs Pentagon
Claude Sonnet 4.6 debuts as the Free/Pro default with 1M context and SWE-Bench wins, but Gemini 3.1 Pro edges ahead on frontier evals (77% on ARC-AGI vs. Opus's 69%), while Anthropic faces Pentagon pressure over its refusal to allow fully autonomous lethal weapons deployment.
Talkie: a 13B vintage language model from 1930
Researchers trained a 13B language model exclusively on pre-1931 text to investigate how historical data shapes model knowledge and temporal prediction capability; the project includes a Claude Sonnet-powered demo.
Cloudflare can remember it for you wholesale
Cloudflare's Agent Memory service lets AI agents offload conversation context, recovering the 10-20% of token space currently wasted on system prompts and tools, enabling more efficient use of limited context windows.
The Boy That Cried Mythos: Verification is Collapsing Trust in Anthropic
Anthropic's Claude Mythos security verification overstates results: the flagship Firefox demo tested patched containers with pre-discovered bugs, and real code-execution rates collapse from 72.4% to 4.4% when key exploitable vulnerabilities are removed.
Import AI 452: Scaling laws for cyberwar; rising tides of AI automation; and a puzzle over GDP forecasting
AI cyberattack capabilities scale exponentially: Claude Opus 4.6 achieves 50% success on expert-level tasks, with performance doubling every 5–7 months, while open models rapidly close the gap to proprietary systems.