Publications

News

ICML 2026
CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction

Yinghao Ma*, Haiwen Xia*, Hewei Gao, Weixiong Chen, Yuxin Ye, Yuchen Yang, Sungkyun Chang, Mingshuo Ding, Yizhi Li, Ruibin Yuan, Simon Dixon, Emmanouil Benetos

  • Introduced a new task: evaluating audio generated from compositional music instructions.
  • Created the CMI-Pref dataset from the outputs of 23 music generation models, comprising 110k LLM-labeled preferences, 797 hours of generated audio, and a human-labeled subset of 4,027 preference labels.
  • Built CMI-RewardBench to evaluate preference learning methods on compositional multimodal music instructions.
  • Trained a parameter-efficient reward model that reaches near-SOTA correlation with human labels across multiple evaluation protocols and supports inference-time scaling for music generation models.

You can also find my articles on my Google Scholar profile.