CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction

Published at ICML 2026

The paper CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction has been accepted at ICML 2026.


Authors: Yinghao Ma, Haiwen Xia, Hewei Gao, Weixiong Chen, Yuxin Ye, Yuchen Yang, Sungkyun Chang, Mingshuo Ding, Yizhi Li, Ruibin Yuan, Simon Dixon, Emmanouil Benetos

Paper link (arXiv)

  • Introduced a new task for evaluating generated audio from compositional music instructions.
  • Created CMI-Pref using 23 music generation models, with an LLM-labeled dataset of 110k preferences and 797 hours of generated audio, plus a human-labeled subset with 4027 preference labels.
  • Built CMI-RewardBench by integrating prior works to further evaluate preference learning.
  • Trained a parameter-efficient reward model that achieves near-state-of-the-art correlation with human labels across multiple evaluation protocols, which demonstrates the effectiveness of CMI-Pref and enables inference-time scaling for music generation models.
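The announcement does not spell out the training objective used for the reward model. A common choice when learning from pairwise preference labels like those in CMI-Pref is the Bradley-Terry loss, sketched below in pure Python; the scores and function name are illustrative, not from the paper:

```python
import math

def bradley_terry_loss(r_chosen, r_rejected):
    """Pairwise Bradley-Terry loss: -log sigmoid(r_w - r_l), averaged over pairs.

    r_chosen / r_rejected are reward scores for the preferred and
    rejected audio clips in each labeled pair (hypothetical values here).
    """
    # -log sigmoid(d) == log(1 + exp(-d)), computed stably with log1p
    losses = [math.log1p(math.exp(-(rw - rl)))
              for rw, rl in zip(r_chosen, r_rejected)]
    return sum(losses) / len(losses)

# Toy scores: the model should assign higher reward to preferred clips.
chosen = [1.2, 0.8, 2.0]
rejected = [0.3, 1.0, -0.5]
loss = bradley_terry_loss(chosen, rejected)
```

Minimizing this loss pushes the reward gap between preferred and rejected clips apart, which is what makes the trained scalar reward usable for inference-time scaling (e.g. best-of-n selection over candidate generations).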