WAND: Windowed Attention and Knowledge Distillation for Efficient Autoregressive Text-to-Speech Models

Windowed attention and knowledge distillation enable faster, cheaper autoregressive text-to-speech synthesis without quality loss.

Monday, April 13, 2026 12:00 PM UTC · 2 min read · Source: arXiv CS.CL (Computation & Language)

WAND combines windowed attention and knowledge distillation to improve efficiency in autoregressive text-to-speech models. The technique aims to reduce computational cost and latency while maintaining synthesis quality.
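The announcement does not include code, so the following is only a rough sketch of the two ingredients named in the title: a causal sliding-window attention and a standard soft-target distillation loss. All function names, the single-head setup, the window size, and the temperature are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def causal_window_mask(seq_len, window):
    """True where query position i may attend key position j:
    causal (j <= i) and within the last `window` positions."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

def windowed_attention(q, k, v, window):
    """Single-head scaled dot-product attention restricted to a
    causal sliding window. q, k, v: (seq_len, d) arrays."""
    d = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d)
    scores = np.where(causal_window_mask(len(q), window), scores, -np.inf)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def log_softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target cross-entropy between teacher and student output
    distributions at a softened temperature (Hinton-style KD)."""
    p_teacher = np.exp(log_softmax(teacher_logits / temperature))
    log_p_student = log_softmax(student_logits / temperature)
    return -(p_teacher * log_p_student).sum(axis=-1).mean() * temperature ** 2
```

The efficiency argument is that each query attends to at most `window` keys, so attention cost drops from O(n^2) to O(n * window) per layer, while the distillation loss lets the restricted student match a full-attention teacher's output distribution.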
Tags
models