Infrastructure

How AI Gateway runs on Fluid compute

Vercel's AI Gateway cuts its billed compute time from 100% of runtime to 8% by moving to Fluid compute's Active CPU Pricing, which charges only for time the CPU is actually executing, not for idle time spent waiting on external AI provider responses.

Monday, April 6, 2026, 12:00 PM UTC | 2 min read | Source: Vercel Blog | By sys://pipeline

Vercel's AI Gateway exposes hundreds of AI models through a single interface and now runs on Fluid compute with Active CPU Pricing. The Gateway spends 92% of its runtime waiting for provider responses. Under Active CPU Pricing, only the remaining 8%, the time the CPU is actually executing, is billed, so billed time falls from 100% of runtime to 8%. The architecture shows how serverless platforms can adapt to serve network-bound AI workloads efficiently.
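The arithmetic behind the 100% to 8% figure can be sketched in a few lines. This is a simplified illustration, not Vercel's billing formula: the function name `billed_cpu_ms` and the 10-second request are hypothetical, and the sketch assumes that only active CPU time is charged, ignoring any other billing components a real plan may include.

```python
def billed_cpu_ms(wall_clock_ms: float, wait_fraction: float) -> float:
    """Return the milliseconds billed when only active CPU time is charged.

    Hypothetical helper for illustration; not part of any Vercel API.
    """
    if not 0.0 <= wait_fraction <= 1.0:
        raise ValueError("wait_fraction must be between 0 and 1")
    # Time spent waiting on the provider is excluded from the bill.
    return wall_clock_ms * (1.0 - wait_fraction)


# Hypothetical request: 10 s of wall-clock time, 92% of it spent waiting
# for an AI provider to respond (the share reported in the article).
wall_clock_ms = 10_000
active_ms = billed_cpu_ms(wall_clock_ms, wait_fraction=0.92)
billed_share = active_ms / wall_clock_ms

print(f"billed {active_ms:.0f} ms of {wall_clock_ms} ms ({billed_share:.0%})")
# prints: billed 800 ms of 10000 ms (8%)
```

Under a model that bills the full wall-clock duration, the same request would be charged for all 10,000 ms. The gap between the two numbers is the idle wait time.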

Tags
infrastructure