
Attention-based Predominant Instruments Recognition in Polyphonic Music

Publication Type : Conference Paper

Source : Proceedings of 18th Sound and Music Computing Conference (SMC), Torino, Italy, 29 June – 01 July 2021, pp. 199–206

Url : https://zenodo.org/records/5043841

Campus : Coimbatore

School : School of Artificial Intelligence

Center : Center for Computational Engineering and Networking

Year : 2021

Abstract : Predominant instrument recognition in polyphonic music is addressed using the score-level fusion of two visual representations, namely, the Mel-spectrogram and the modgdgram. A modgdgram is a visual representation obtained by stacking the modified group delay functions of consecutive frames. Convolutional neural networks (CNNs) with an attention mechanism learn the distinctive local characteristics and classify each instrument into the group to which it belongs. The proposed system is systematically evaluated on the IRMAS dataset with eleven classes. We train the network on fixed-length, single-labeled audio excerpts and estimate the predominant instruments in variable-length audio recordings. A wave generative adversarial network (WaveGAN) architecture is also employed to generate audio files for data augmentation. The proposed system reports micro and macro F1 scores of 0.65 and 0.60, respectively, which are 20.37% and 27.66% higher than those obtained by the state-of-the-art Han model. The experiments demonstrate the potential of a CNN with an attention mechanism on a Mel-spectro/modgdgram fusion framework for the task of predominant instrument recognition.
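The score-level (late) fusion described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-class score vectors, class count (11 IRMAS classes), and the equal fusion weight are assumptions for the example.

```python
import numpy as np

# Hypothetical per-class scores for one excerpt from two independently
# trained CNN branches (Mel-spectrogram branch and modgdgram branch),
# over 11 instrument classes. Values are purely illustrative.
mel_scores = np.array(
    [0.05, 0.10, 0.40, 0.05, 0.05, 0.05, 0.10, 0.05, 0.05, 0.05, 0.05]
)
modgd_scores = np.array(
    [0.05, 0.05, 0.50, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
)

def score_level_fusion(a, b, weight=0.5):
    """Weighted average of two per-class score vectors (late fusion)."""
    return weight * a + (1.0 - weight) * b

fused = score_level_fusion(mel_scores, modgd_scores)
predicted_class = int(np.argmax(fused))  # index of the predominant instrument
```

With equal weights, the fused score for each class is simply the mean of the two branch scores, so a class that both branches rank highly wins even if neither branch is individually confident.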

Cite this Research Publication : Lekshmi C. Reghunath and Rajeev Rajan, “Attention-based Predominant Instruments Recognition in Polyphonic Music”, in Proceedings of the 18th Sound and Music Computing Conference (SMC), Torino, Italy, 29 June – 01 July 2021, pp. 199–206.
