Open Access Article

Title: Multi-instrument polyphonic automatic transcription method combining gated recurrent units and DeepLabv3+ model

Authors: Xiaochen Ye

Addresses: Music College, Neijiang Normal University, Neijiang, 641100, China

Abstract: A multi-instrument polyphonic automatic transcription method integrating bidirectional gated recurrent units and an improved DeepLabv3+ network is proposed to enhance transcription accuracy under complex audio conditions. A pre-separation module first performs source separation and denoising. Frequency-harmonic composite features are then extracted, and temporal dependencies are modelled using a gated recurrent network, followed by lightweight decoding for note onset localisation and instrument classification. Experiments show that the proposed model achieves 92.8%, 91.5%, and 92.1% accuracy, recall, and F1 on the training set, and 91.2%, 88.7%, and 89.9% on the test set, surpassing baseline methods. In mixed-instrument scenarios, the model attains an average F1 of 83.65% and 88.3% note recognition accuracy, improving piano-violin transcription by 7%. The method offers high precision and robustness for polyphonic transcription, providing a practical foundation for intelligent music analysis and automatic orchestration.
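The reported metric triples can be sanity-checked with a short calculation. The sketch below is not taken from the article: it assumes that F1 is the usual harmonic mean of precision and recall, and that the first figure in each triple (labelled "accuracy" in the abstract) plays the role of precision. The helper names `f1_score` and `recompute_f1` are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, in percent: (first metric, recall, reported F1).
# Treating the first metric as precision is an assumption, not a statement
# from the article.
reported = {
    "train": (92.8, 91.5, 92.1),
    "test": (91.2, 88.7, 89.9),
}


def recompute_f1(triples):
    """Recompute F1 for each split, rounded to one decimal like the abstract."""
    return {
        split: round(f1_score(p, r), 1)
        for split, (p, r, _reported_f1) in triples.items()
    }


computed = recompute_f1(reported)
for split, (_p, _r, reported_f1) in reported.items():
    print(f"{split}: recomputed F1 = {computed[split]}, reported F1 = {reported_f1}")
```

Under that assumption the recomputed values round to the F1 figures quoted in the abstract, which suggests the three numbers are internally consistent.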

Keywords: multi-instrument polyphonic auto-transcription; DeepLabv3+ network; bidirectional gated recurrent unit; audio feature extraction; pre-separation.

DOI: 10.1504/IJICT.2026.151529

International Journal of Information and Communication Technology, 2026 Vol.27 No.4, pp.69 - 90

Received: 17 Oct 2025
Accepted: 01 Dec 2025

Published online: 04 Feb 2026