Applied AI & Compilers @ MPS Lab
arXiv: 2505.06279
We present an interpretability framework for unsupervised reinforcement learning (URL) agents, aimed at understanding how intrinsic motivation shapes attention, behavior, and representation learning. We find that curiosity-driven agents display broader, more dynamic attention and more exploratory behavior than their extrinsically motivated counterparts.
Fine-tune Llama 3.2 1B on consumer hardware, using uv to manage the Python environment. Includes rigorous benchmarking of training throughput and memory footprint.
A comprehensive interpretability framework for understanding how curiosity-driven agents explore and represent the world in unsupervised RL settings.
Minimal implementation of the Mamba state space model.