CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction
Published in ICML 2026
The paper CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction has been accepted at ICML 2026.

Authors: Yinghao Ma, Haiwen Xia, Hewei Gao, Weixiong Chen, Yuxin Ye, Yuchen Yang, Sungkyun Chang, Mingshuo Ding, Yizhi Li, Ruibin Yuan, Simon Dixon, Emmanouil Benetos
- Introduced a new task: evaluating audio generated from compositional music instructions.
- Created CMI-Pref from the outputs of 23 music generation models: an LLM-labeled set of 110k preference labels over 797 hours of generated audio, plus a human-labeled subset of 4,027 preference labels (a sketch of the standard way such pairwise labels train a reward model follows this list).
- Built CMI-RewardBench by integrating prior benchmarks to further evaluate preference learning.
- Trained a parameter-efficient reward model that achieves near-SOTA correlation with human labels across multiple evaluation protocols. This demonstrates the effectiveness of CMI-Pref and enables inference-time scaling for music generation models (see the best-of-N sketch below).
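
Reward models are commonly trained on pairwise preference labels like those in CMI-Pref using a Bradley-Terry objective. The minimal sketch below illustrates that standard recipe only; the `reward_model` callable, its `(instruction, audio)` signature, and the batch shapes are assumptions for illustration, not the paper's actual training code.

```python
import torch.nn.functional as F

def preference_loss(reward_model, instruction, chosen_audio, rejected_audio):
    """Bradley-Terry pairwise loss over a batch of preference labels.

    Assumes reward_model(instruction, audio) returns one scalar score
    per example, i.e. a tensor of shape (batch,).
    """
    r_chosen = reward_model(instruction, chosen_audio)
    r_rejected = reward_model(instruction, rejected_audio)
    # -log sigmoid(r_chosen - r_rejected) is minimized when the model
    # ranks the labeled-preferred clip above the rejected one.
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```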
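
A reward model also enables inference-time scaling without retraining the generator: sample several candidate clips and keep the highest-scoring one. Best-of-N reranking, sketched below, is one common realization of this idea; whether the paper uses this exact scheme is not stated in this summary, and `generate` is a hypothetical text-to-music callable.

```python
def best_of_n(generate, reward_model, instruction, n=8):
    """Generate n candidate clips and return the highest-reward one."""
    candidates = [generate(instruction) for _ in range(n)]
    scores = [reward_model(instruction, audio) for audio in candidates]
    best = max(range(n), key=scores.__getitem__)
    return candidates[best], scores[best]
```

Larger n trades extra generator compute for higher expected reward, which is the sense in which a better reward model directly improves generation quality at inference time.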
