Code generation
5 mentions across all digests
Code generation is the automatic production of source code by AI models, evaluated here through benchmarks such as ACES, which measures test-suite robustness, and ReCUBE, which measures repository-level context utilization.
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation
The ACES metric reveals fragile test suites in code generation benchmarks by measuring whether benchmark scores hold up when individual test cases are removed.
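The paper's actual AUC-based statistic is not reproduced in this digest; the sketch below is an invented, minimal illustration of the leave-one-out idea, using ranking stability as a stand-in for the AUC consistency measure and assuming a simple pass/fail matrix of models against a shared test suite. All function and variable names are hypothetical.

```python
def suite_scores(results, exclude=None):
    """Fraction of tests passed per model, optionally dropping one test."""
    scores = {}
    for model, passes in results.items():
        kept = [p for t, p in enumerate(passes) if t != exclude]
        scores[model] = sum(kept) / len(kept)
    return scores

def ranking(scores):
    # Deterministic order: descending score, ties broken by model name.
    return sorted(scores, key=lambda m: (-scores[m], m))

def leave_one_out_consistency(results):
    """Fraction of leave-one-out runs that preserve the full-suite ranking.

    A robust suite scores 1.0; a fragile suite's model ranking changes
    when single test cases are removed."""
    n_tests = len(next(iter(results.values())))
    base = ranking(suite_scores(results))
    stable = sum(
        ranking(suite_scores(results, exclude=t)) == base
        for t in range(n_tests)
    )
    return stable / n_tests

results = {
    "model_a": [True, True, False, False],  # passes tests 0 and 1
    "model_b": [True, False, True, True],   # passes tests 0, 2, and 3
}
print(leave_one_out_consistency(results))  # 0.5: dropping test 2 or 3 changes the ranking
```

Here model_b leads on the full suite, but removing either of the last two tests erases its margin, so half of the leave-one-out runs disagree with the original ranking.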
Embarrassingly simple self-distillation improves code generation
Self-distillation emerges as a remarkably simple technique that meaningfully improves code generation quality in existing language models without retraining from scratch.
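The paper's exact recipe is not detailed in this digest; the sketch below illustrates one common form of self-distillation under stated assumptions: sample several candidate solutions from the model, keep only those that pass the task's checks, and reuse the survivors as training data or in-context exemplars. The `generate` function is a stand-in for a real model call, and the task is invented.

```python
import random

def generate(prompt, temperature=0.8):
    """Stand-in for an LLM call; returns one candidate solution string."""
    body = random.choice(["return a + b", "return a - b", "return a * b"])
    return f"def add(a, b):\n    {body}"

def passes_checks(code):
    """Execute the candidate and test its behavior; True if correct."""
    namespace = {}
    exec(code, namespace)
    return namespace["add"](2, 3) == 5

def self_distill(prompt, n_samples=20):
    """Collect self-generated solutions that survive the checks."""
    kept = []
    for _ in range(n_samples):
        candidate = generate(prompt)
        if passes_checks(candidate):
            kept.append(candidate)
    return kept

random.seed(0)
data = self_distill("Write add(a, b).")
# `data` now holds verified self-generated examples, usable as
# fine-tuning data or few-shot exemplars for the same model.
```

The filtering step is what makes the loop useful: only outputs the model can verify (here, by passing a test) are fed back, so the model learns from its own best behavior rather than from all of it.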
ReCUBE: Evaluating Repository-Level Context Utilization in Code Generation
ReCUBE introduces a benchmark measuring how well code generation models leverage full-repository context versus isolated snippets, critical for evaluating AI coding assistants' real-world effectiveness.
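ReCUBE's actual protocol is not reproduced here; the sketch below is a hypothetical illustration of the kind of comparison such a benchmark makes: evaluate the same tasks twice, once with only the local snippet and once with full-repository context in the prompt, and report the pass-rate gap as a measure of context utilization. The numbers are invented for illustration.

```python
def pass_rate(results):
    """Fraction of tasks solved, given a list of pass/fail booleans."""
    return sum(results) / len(results)

def context_utilization_gap(snippet_results, repo_results):
    """Positive gap: the model benefits from repository-level context."""
    return pass_rate(repo_results) - pass_rate(snippet_results)

# Invented example outcomes over five tasks.
snippet_only = [True, False, False, True, False]  # 40% pass
full_repo    = [True, True, False, True, True]    # 80% pass
print(context_utilization_gap(snippet_only, full_repo))  # 0.4
```

A model that scores the same in both conditions is ignoring the repository; the gap, not the raw pass rate, is what isolates context utilization.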
Training code generation models to debug their own outputs
Amazon trains code generation models to self-debug using supervised fine-tuning and reinforcement learning, improving both initial outputs and iterative error correction, a significant step for agentic coding systems.
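Amazon's training recipe itself (supervised fine-tuning plus reinforcement learning) is not shown in this digest; the sketch below illustrates only the inference-time self-debugging loop such a model enables: generate code, execute it against checks, and feed any error message back for a revised attempt. The `model` function is a stand-in that "fixes" its bug once it sees an error, and all names are hypothetical.

```python
def model(prompt):
    """Stand-in LLM: returns a corrected solution once the prompt
    contains an error message from a previous attempt."""
    if "Error:" in prompt:
        return "def median(xs):\n    xs = sorted(xs)\n    return xs[len(xs) // 2]"
    return "def median(xs):\n    return xs[len(xs) // 2]"  # bug: forgets to sort

def run_checks(code):
    """Execute the candidate; return None on success or an error string."""
    namespace = {}
    exec(code, namespace)
    try:
        assert namespace["median"]([3, 1, 2]) == 2
        return None
    except AssertionError:
        return "Error: median([3, 1, 2]) returned a wrong value"

def self_debug(task, max_rounds=3):
    """Regenerate with error feedback until checks pass or rounds run out."""
    prompt = task
    for _ in range(max_rounds):
        code = model(prompt)
        error = run_checks(code)
        if error is None:
            return code
        prompt = f"{task}\n{error}\nFix the code."
    return code

fixed = self_debug("Write median(xs).")
```

The loop is what makes the setup agentic: execution feedback, not a human, drives each revision, which is why training the model to interpret its own error messages matters.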
Making a web app generator with open ML models