Open Access Article

Title: Analysis of an intelligent piano music transcription model by deep reinforcement learning

Authors: Yan Hu; Jing Wang

Addresses: Humanities Quality Education Center, University of Science and Technology, Beijing 100083, China; Humanities Quality Education Center, University of Science and Technology, Beijing 100083, China

Abstract: To improve the accuracy of automatic piano music transcription in complex environments, a recognition system applicable to practical scenarios such as music education assistance and intelligent performance analysis was developed. First, audio features were extracted using Log-Mel spectrograms, combined with data augmentation and adaptive pitch normalisation to enhance model robustness. Second, a state-action modelling mechanism integrating a Transformer encoder with a multidimensional action space was constructed to precisely represent note content, rhythmic positions, and dynamics information. Finally, a primary policy and an auxiliary rhythm policy based on proximal policy optimisation (PPO) were designed, and a multidimensional reward function and imitation learning signals were introduced to jointly optimise the note prediction strategy. Comparative experiments indicated that incorporating the multidimensional action structure and boundary auxiliary strategy significantly improved recognition accuracy. The proposed method achieves high-precision piano audio transcription with strong structural continuity.
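
For readers unfamiliar with PPO, the sketch below shows the standard clipped surrogate objective that PPO-based policies are generally trained to maximise. It is only an illustration of the general technique named in the abstract, not the authors' implementation: the function name `ppo_clipped_objective`, the default `clip_eps=0.2`, and the toy inputs are assumptions, and the abstract does not specify the paper's reward function, advantage estimator, or imitation learning terms.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximise).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the default clip_eps are illustrative assumptions.
    """
    if not (len(new_logps) == len(old_logps) == len(advantages)):
        raise ValueError("inputs must have the same length")
    if not new_logps:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(new_logps)


if __name__ == "__main__":
    # Toy values only; they do not come from the paper.
    print(ppo_clipped_objective([-0.9, -1.2], [-1.0, -1.0], [0.5, -0.3]))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data in a single update, which is the usual motivation for choosing PPO in sequence prediction settings like the one described.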

Keywords: piano transcription; deep reinforcement learning; DRL; multidimensional action space; music sequence modelling; proximal policy optimisation; PPO.

DOI: 10.1504/IJICT.2026.151494

International Journal of Information and Communication Technology, 2026 Vol.27 No.3, pp.18 - 35

Received: 29 Sep 2025
Accepted: 10 Nov 2025

Published online: 02 Feb 2026