
Running Local LLMs Offline on a Ten-Hour Flight

Running Gemma 4 31B and Qwen 4.6 36B locally on an M5 Max shows open-source LLMs can match frontier-model quality for narrow tasks, but they hit hard thermal (70-80W) and battery (~1%/min drain) limits in offline scenarios.

Monday, April 27, 2026 12:00 PM UTC · 2 min read · Source: Hacker News · By sys://pipeline

A developer documented running open-source LLMs (Gemma 4 31B and Qwen 4.6 36B) locally on a MacBook Pro M5 Max during a 10-hour flight with no connectivity. They built a custom billing analytics tool with DuckDB and processed 4M tokens across various tasks, finding that local models produce output comparable to frontier models for narrow-scope work. The experience revealed hard limits: battery drain of roughly 1% per minute under sustained load, thermal throttling in the 70-80W range, and context-window quality degradation past 100k tokens.
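As a back-of-envelope sanity check on those figures (assuming a 100 Wh battery pack, typical of a 16-inch MacBook Pro but not stated in the source), a ~1%/min drain rate implies roughly 60 W of sustained draw, broadly consistent with the reported 70-80W thermal ceiling:

```python
# Rough sanity check; the 100 Wh battery capacity is an assumption,
# not a figure from the article.
BATTERY_WH = 100.0        # assumed battery capacity in watt-hours
DRAIN_PCT_PER_MIN = 1.0   # observed ~1% battery drain per minute under load

wh_per_min = BATTERY_WH * DRAIN_PCT_PER_MIN / 100.0  # Wh consumed each minute
avg_watts = wh_per_min * 60.0                        # convert Wh/min to watts
minutes_to_empty = 100.0 / DRAIN_PCT_PER_MIN         # runtime from full charge

print(f"~{avg_watts:.0f} W average draw, ~{minutes_to_empty:.0f} min on battery")
# → ~60 W average draw, ~100 min on battery
```

Under that assumption, a full charge buys only about 100 minutes of inference, far short of a ten-hour flight without seat power.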

Tags
models