Interpretable Reward Modeling with Active Concept Bottlenecks

Jul 18, 2025
Sonia Laguna, Katarzyna Kobalczyk, Julia E. Vogt, Mihaela Van Der Schaar
Type
Publication
In ICML 2025 Workshop PRAL