A developer documented running open-source LLMs (Gemma 4 31B and Qwen 4.6 36B) locally on a MacBook Pro M5 Max during a 10-hour flight without connectivity. They built a custom billing analytics tool with DuckDB and processed 4M tokens across various tasks, finding that local models produce output comparable to frontier models for narrow-scope work. The experience revealed hard limits: ~1% battery drain per minute under load, thermal constraints at 70-80W, and context window degradation past 100k tokens.
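The post itself shares no code, but the workflow it describes (local SQL analytics feeding a locally served model) is easy to picture. Below is a minimal sketch under stated assumptions: a hypothetical billing_export.csv with service and cost_usd columns, the duckdb and ollama Python packages, and an Ollama server already running on the laptop; the gemma3:27b tag is a stand-in for whichever quantized build actually fits in memory.

```python
# Sketch of an offline billing-analytics loop: aggregate spend with
# DuckDB, then ask a local model to summarize the result. No network
# access is required at any step. File name, schema, and model tag
# are assumptions, not details from the post.
import duckdb
import ollama  # assumes a local Ollama server is running

# Aggregate spend per service from a hypothetical billing export.
con = duckdb.connect()  # in-memory database
totals = con.execute("""
    SELECT service, SUM(cost_usd) AS total_usd
    FROM read_csv_auto('billing_export.csv')
    GROUP BY service
    ORDER BY total_usd DESC
""").fetchall()

report = "\n".join(f"{service}: ${usd:,.2f}" for service, usd in totals)

# Hand the aggregate to the local model for a narrow-scope summary.
reply = ollama.chat(
    model="gemma3:27b",  # stand-in tag for a local quantized build
    messages=[{
        "role": "user",
        "content": f"Summarize the notable cost drivers:\n{report}",
    }],
)
print(reply["message"]["content"])
```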
Models
Running Local LLMs Offline on a Ten-Hour Flight
Running Gemma 4 31B and Qwen 4.6 36B locally on an M5 Max shows that open-source LLMs can match frontier-model quality on narrow tasks, but they run into hard thermal (70-80W) and battery (~1%/min drain) limits in offline scenarios.
Monday, April 27, 2026 12:00 PM UTC · 2 MIN READ · SOURCE: Hacker News · BY sys://pipeline
Tags
models