A developer documented running open-source LLMs (Gemma 4 31B and Qwen 4.6 36B) locally on a MacBook Pro M5 Max during a 10-hour flight without connectivity. They built a custom billing analytics tool with DuckDB and processed 4M tokens across various tasks, finding that local models produce output comparable to frontier models for narrow-scope work. The experience revealed hard limits: ~1% battery drain per minute under load, thermal constraints at 70-80W, and context window degradation past 100k tokens.
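The post itself shares no code, but the workflow it describes (local SQL analytics feeding a locally served model) is easy to picture. Below is a minimal sketch under stated assumptions: a hypothetical billing_export.csv with service and cost_usd columns, the duckdb and ollama Python packages, and an Ollama server already running on the laptop; the gemma3:27b tag is a stand-in for whichever quantized build actually fits in memory.

```python
# Sketch of an offline billing-analytics loop: aggregate spend with
# DuckDB, then ask a local model to summarize the result. No network
# access is required at any step. File name, schema, and model tag
# are assumptions, not details from the post.
import duckdb
import ollama  # assumes a local Ollama server is running

# Aggregate spend per service from a hypothetical billing export.
con = duckdb.connect()  # in-memory database
totals = con.execute("""
    SELECT service, SUM(cost_usd) AS total_usd
    FROM read_csv_auto('billing_export.csv')
    GROUP BY service
    ORDER BY total_usd DESC
""").fetchall()

report = "\n".join(f"{service}: ${usd:,.2f}" for service, usd in totals)

# Hand the aggregate to the local model for a narrow-scope summary.
reply = ollama.chat(
    model="gemma3:27b",  # stand-in tag for a local quantized build
    messages=[{
        "role": "user",
        "content": f"Summarize the notable cost drivers:\n{report}",
    }],
)
print(reply["message"]["content"])
```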
Models
Running Local LLMs Offline on a Ten-Hour Flight
Running Gemma 4 31B and Qwen 4.6 36B locally on an M5 Max shows that open-source LLMs can match frontier-model quality on narrow tasks, but they run into hard thermal (70-80W) and battery (~1%/min drain) limits in offline scenarios.
Monday, April 27, 2026 12:00 PM UTC · 2 MIN READ · SOURCE: Hacker News · BY sys://pipeline
Tags
models