Vercel and Braintrust compared three agent architectures for querying structured data: a SQL agent, a bash agent working over a filesystem, and a hybrid of the two. The SQL agent reached 100% accuracy with minimal token usage. The bash agent lagged at 53% accuracy but was able to verify its own results. The hybrid matched SQL's accuracy while adding that self-verification. The broader lesson was about the evals themselves: it took detailed iteration on them to surface problems in the ground truth.
Research
Testing if "bash is all you need"
Vercel and Braintrust's hybrid bash+SQL agent architecture matched pure SQL's 100% accuracy while adding self-verification, suggesting filesystem-based agents can be production-viable with the right architecture.
Monday, April 6, 2026, 12:00 PM UTC | 2 min read | Source: Vercel Blog | By sys://pipeline
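The summary does not say how the hybrid agent verifies its answers, so the following is only a minimal sketch of one way self-verification could work: answer the question with SQL, recompute it independently by scanning files the way a bash-style agent might, and report whether the two agree. The `orders` table, the sample records, and every function name here are hypothetical illustrations, not details from the Vercel and Braintrust benchmark.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample records; the benchmark's real dataset is not described in the source.
ORDERS = [
    {"id": 1, "customer": "acme", "total": 120.0},
    {"id": 2, "customer": "globex", "total": 75.5},
    {"id": 3, "customer": "acme", "total": 30.0},
]


def answer_with_sql(conn, customer):
    """Primary path: answer the question with a single SQL aggregate."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return float(row[0])


def answer_with_files(data_dir, customer):
    """Independent path: scan JSON files, as a bash-style agent might with grep or jq."""
    total = 0.0
    for path in sorted(Path(data_dir).glob("*.json")):
        record = json.loads(path.read_text())
        if record["customer"] == customer:
            total += record["total"]
    return total


def hybrid_answer(conn, data_dir, customer, tolerance=1e-9):
    """Return the SQL answer plus a flag saying whether the file scan agrees."""
    sql_total = answer_with_sql(conn, customer)
    file_total = answer_with_files(data_dir, customer)
    return sql_total, abs(sql_total - file_total) <= tolerance


def build_fixture(data_dir):
    """Load the same records into an in-memory SQLite table and one JSON file each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (:id, :customer, :total)", ORDERS)
    for order in ORDERS:
        (Path(data_dir) / f"order_{order['id']}.json").write_text(json.dumps(order))
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = build_fixture(tmp)
        total, verified = hybrid_answer(conn, tmp, "acme")
        print(total, verified)
```

In this sketch a disagreement between the two paths only sets the flag to false. A real agent would presumably use that signal to retry or escalate, but the source does not describe that behavior.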