STD Standards Models

SWE-bench

7 mentions across all digests

SWE-bench is a benchmark that evaluates AI models on real-world software engineering tasks drawn from GitHub repositories. OpenAI has stopped using it, citing concerns including saturation and overfitting, and has shifted toward newer evaluations such as SWE-Lancer.

/// Stats
First Seen: 2026-03-24
Last Seen: 2026-04-15
Total Mentions: 7
Last 7 Days: 0
Sources: 5
Peak Relevance: 5/5
Active Predictions: 1
/// Connected Entities