Researchers analyze political content in LLM training data (both pre- and post-training) and find a systematic left-leaning skew, along with a strong correlation between data composition and model behavior. Political biases emerge at the base model stage and persist through fine-tuning, suggesting that bias mitigation requires attention to training data curation.
Safety
What Is The Political Content in LLMs' Pre- and Post-Training Data?
LLM training data exhibits a systematic left-leaning political skew that correlates strongly with model behavior. The bias emerges at the base model stage and persists through fine-tuning, suggesting that mitigation requires curation at the data source, not just post-hoc alignment.
Monday, April 6, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: arXiv CS.CL (Computation & Language) | BY sys://pipeline
Tags
safety