Google DeepMind has released Gemma 4, a family of four Apache 2.0-licensed, vision-capable models (E2B, E4B, 31B, and a 26B-A4B mixture-of-experts) with multimodal support for images, video, and audio. The two smaller models use Per-Layer Embeddings (PLE) to stretch parameter efficiency, which is why they are branded with "Effective" (E) parameter counts. Simon Willison confirmed that local runs via LM Studio's GGUFs worked for every size except the 31B.
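The "Effective" parameter counts can be made concrete with a small sketch. Under the assumption (not stated in the source) that PLE lets the per-layer embedding tables be streamed from host memory rather than held on the accelerator, the accelerator-resident ("effective") count is simply the raw total minus the PLE share. All figures below are illustrative, not Gemma 4's actual numbers.

```python
def effective_params(total_params: float, ple_params: float) -> float:
    """Accelerator-resident parameter count when per-layer
    embedding (PLE) tables are streamed from host memory
    instead of being kept on the accelerator.

    Illustrative assumption only: treats 'effective' as
    total minus PLE, which is not confirmed by the source.
    """
    if ple_params > total_params:
        raise ValueError("PLE share cannot exceed the total")
    return total_params - ple_params

# A hypothetical 5B-parameter model whose PLE tables hold 3B
# parameters would behave like a 2B ("E2B") model in terms of
# accelerator memory.
print(effective_params(5e9, 3e9) / 1e9)  # → 2.0
```

The point of the branding, on this reading, is that memory footprint on the accelerator, not raw parameter count, determines which devices can run the model.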
Models
Gemma 4: Byte for byte, the most capable open models
Google DeepMind released Gemma 4, a family of four Apache 2.0-licensed multimodal models (up to 31B parameters) that use Per-Layer Embeddings for parameter efficiency and support images, video, and audio.
Friday, April 3, 2026, 12:00 PM UTC · 2 min read · Source: Simon Willison
Tags
models