A research paper proposes Curiosity-Critic, a method for training world models that uses cumulative prediction error improvement as an intrinsic reward signal. The approach is presented as a more tractable alternative to standard curiosity mechanisms in reinforcement learning, with the aim of making world model training more efficient.
Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training
A new intrinsic reward based on cumulative prediction error improvement replaces expensive curiosity signals with a more tractable one, aiming to improve world model training efficiency.
Wednesday, April 22, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: arXiv CS.LG (Machine Learning) | BY sys://pipeline
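To make the idea of rewarding prediction error improvement concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the summary gives no implementation details, so everything here is an assumption made for illustration. That includes the one-parameter linear world model, the names `LinearWorldModel`, `ImprovementReward`, and `run_demo`, the made-up true dynamics `next_state = 2 * state`, and the reading of "cumulative" as "summed over every transition seen so far". The reward for a step is how much that summed error drops after one model update.

```python
from dataclasses import dataclass, field


@dataclass
class LinearWorldModel:
    """Toy 1-D world model (hypothetical): predicts next_state = weight * state."""
    weight: float = 0.0
    lr: float = 0.1

    def error(self, state: float, next_state: float) -> float:
        # Squared prediction error on a single transition.
        return (self.weight * state - next_state) ** 2

    def update(self, state: float, next_state: float) -> None:
        # One gradient-descent step on the squared error of this transition.
        grad = 2.0 * (self.weight * state - next_state) * state
        self.weight -= self.lr * grad


@dataclass
class ImprovementReward:
    """Intrinsic reward = drop in summed error over all stored transitions."""
    model: LinearWorldModel
    buffer: list = field(default_factory=list)

    def cumulative_error(self) -> float:
        # Assumption: "cumulative" means summed over every transition seen so far.
        return sum(self.model.error(s, s2) for s, s2 in self.buffer)

    def step(self, state: float, next_state: float) -> float:
        self.buffer.append((state, next_state))
        before = self.cumulative_error()
        self.model.update(state, next_state)
        after = self.cumulative_error()
        return before - after  # positive when the update improved the model


def run_demo(n_steps: int = 60):
    model = LinearWorldModel()
    critic = ImprovementReward(model)
    states = (1.0, -1.0, 0.5)  # arbitrary states visited in a cycle
    rewards = []
    for t in range(n_steps):
        state = states[t % len(states)]
        next_state = 2.0 * state  # made-up true dynamics for the demo
        rewards.append(critic.step(state, next_state))
    return rewards, model


if __name__ == "__main__":
    rewards, model = run_demo()
    print("first reward:", rewards[0])
    print("last reward:", rewards[-1])
    print("learned weight:", model.weight)
```

In this toy setting the reward stays positive while the model is still learning and fades toward zero once the dynamics are fitted, which is the usual motivation for improvement-based curiosity: transitions the model can no longer learn from stop being rewarding. Recomputing the error over the whole buffer at every step is a simplification that would not scale; how the paper keeps this tractable is not covered by the summary.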
Tags
research