Title: A novel similarity measure: Voronoi audio similarity for genre classification
Authors: Prafulla Kalapatapu; N.N. Tejas; Siddharth Dalmia; Prakhar Gupta; Bhaswant Inguva; Aruna Malapati
Addresses: Department of Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Hyderabad Campus, Jawahar Nagar, Shameerpet, Hyderabad, Telangana, 500068, India (all authors)
Abstract: One of the major challenges in genre classification and recommender systems is finding the similarity between a query song and the songs in a database. In this paper, we propose a novel similarity measure called Voronoi audio similarity (VAS). We extracted content-based features from each song's audio signal, split into frames over a particular time period, and represented each song as a point in 2D space. The proposed system is a two-level classification process: songs are first clustered by K-means, and a Voronoi diagram, called the template Voronoi diagram (TVD), is then built from the resulting centroids. This approach learns the decision boundary used for genre classification. The genre of a song is then predicted as the genre with the maximum normalised area overlap. Experiments using 10-fold cross-validation on Million Song Dataset subsets of 500 songs showed 78% accuracy.
Keywords: audio similarity measure; genre classification; content-based recommendation; music information retrieval.
DOI: 10.1504/IJISTA.2017.088054
International Journal of Intelligent Systems Technologies and Applications, 2017 Vol.16 No.4, pp.309 - 318
Received: 11 Apr 2016
Accepted: 13 Feb 2017
Published online: 20 Nov 2017
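
Illustrative sketch: the two-level process described in the abstract (K-means clustering, then a Voronoi partition built from the centroids) might look roughly like the following in plain Python. This is not the authors' implementation. The function names are hypothetical, labelling each cell by majority vote is an assumption, and so is the overlap rule: the abstract does not say how the normalised area overlap is computed, so the query song is assumed here to occupy an axis-aligned rectangle that is sampled on a grid. Finding the nearest centroid is equivalent to locating the Voronoi cell that contains a point, so no explicit diagram is constructed.

```python
import math
import random


def nearest(point, centroids):
    """Index of the closest centroid, i.e. the Voronoi cell containing point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 2D points; returns the list of centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        updated = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def label_cells(points, genres, centroids):
    """Assumption: each cell takes the majority genre of its training songs."""
    votes = [{} for _ in centroids]
    for p, g in zip(points, genres):
        cell = votes[nearest(p, centroids)]
        cell[g] = cell.get(g, 0) + 1
    return [max(v, key=v.get) if v else None for v in votes]


def predict_genre(region, centroids, cell_genres, steps=20):
    """Assumed overlap rule: sample the query rectangle on a grid and return
    the genre whose cells cover the largest fraction of it, plus all fractions."""
    (x0, y0), (x1, y1) = region
    counts = {}
    for i in range(steps):
        for j in range(steps):
            sample = (x0 + (i + 0.5) * (x1 - x0) / steps,
                      y0 + (j + 0.5) * (y1 - y0) / steps)
            g = cell_genres[nearest(sample, centroids)]
            counts[g] = counts.get(g, 0) + 1
    share = {g: n / (steps * steps) for g, n in counts.items()}
    return max(share, key=share.get), share


# Toy data (made up for illustration): two well-separated groups of songs.
songs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
         (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
genres = ["rock"] * 4 + ["jazz"] * 4

centroids = kmeans(songs, k=2)
cell_genres = label_cells(songs, genres, centroids)
predicted, share = predict_genre(((0.0, 0.0), (2.0, 2.0)), centroids, cell_genres)
print(predicted, share)
```

In this toy setup the query rectangle lies entirely inside the cell of the first group, so that group's genre receives the whole normalised overlap. A faithful reproduction would need the feature extraction and the overlap definition from the full paper.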