VALOR: Variational option discovery using VAE-inspired curriculum learning
AI Impact Summary
Variational option discovery is an emerging reinforcement learning technique for automatically discovering reusable behavioral primitives (options) within a single agent. This work introduces VALOR, which applies the variational autoencoder framework to option learning: a policy acts as an encoder from sampled contexts to trajectories, and a decoder reconstructs the context from the complete trajectory, creating a tighter theoretical connection between VAE principles and hierarchical RL. The curriculum learning contribution (dynamically increasing the number of contexts as decoder performance improves) addresses a known training instability in variational option methods, enabling agents to learn substantially more distinguishable behaviors from a single training run. This has direct implications for sample efficiency and generalization in multi-task and transfer learning scenarios.
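The curriculum rule described above can be sketched as a simple update on the number of contexts K: once the decoder reliably identifies which context produced each trajectory, K is grown toward a maximum. The threshold value and the specific growth rule below are assumptions for illustration, not a definitive reproduction of VALOR's schedule.

```python
def update_context_count(k: int, k_max: int, decoder_accuracy: float,
                         threshold: float = 0.86) -> int:
    """Curriculum step: grow the number of contexts K once the decoder
    reconstructs contexts from trajectories reliably enough.

    The growth rule K <- min(floor(1.5*K + 1), K_max) and the 0.86
    threshold are assumed constants for this sketch.
    """
    if decoder_accuracy >= threshold and k < k_max:
        k = min(int(1.5 * k + 1), k_max)
    return k
```

In a training loop, this would be called each epoch with the decoder's current average probability of identifying the correct context, so the agent only faces a harder discrimination problem after mastering the current one.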
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium