Title: A novel similarity measure: Voronoi audio similarity for genre classification

Authors: Prafulla Kalapatapu; N.N. Tejas; Siddharth Dalmia; Prakhar Gupta; Bhaswant Inguva; Aruna Malapati

Addresses: Department of Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Hyderabad Campus, Jawahar Nagar, Shameerpet, Hyderabad, Telangana, 500068, India (all authors)

Abstract: One of the major challenges in genre classification and recommender systems is measuring the similarity between a query song and the songs in a database. In this paper, we propose a novel similarity measure called Voronoi audio similarity (VAS). We extract content-based features from each song's audio signal, split into frames over a particular time period, and represent each song as a point in 2D space. The proposed system is a two-level classification process: songs are first clustered by K-means, and a Voronoi diagram, called the template Voronoi diagram (TVD), is then built from the resulting K-means centroids. This approach learns the decision boundary used for genre classification. The genre of a song is then predicted as the genre with the maximum normalised area overlap. Empirical results from 10-fold cross-validation on 500-song subsets of the Million Song Dataset showed 78% accuracy.
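
The two-level idea in the abstract (K-means, then a Voronoi diagram over the centroids) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `voronoi_cell`, the toy 2D points, and the deterministic initialisation are all assumptions made for illustration. It relies on the fact that a point lies in the Voronoi cell of a site exactly when that site is its nearest one, so the cells are queried without constructing polygon boundaries. The normalised area overlap step is not reproduced here, since the abstract does not specify how it is computed.

```python
import math

def voronoi_cell(point, sites):
    """Index of the Voronoi cell containing `point`, i.e. its nearest site."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on 2D points; returns the list of centroids."""
    # Assumption: initialise from the first k points so the run is repeatable.
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[voronoi_cell(p, centroids)].append(p)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # assignments are stable, so the centroids have converged
        centroids = updated
    return centroids

# Hypothetical 2D song representations (two well-separated groups).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# The centroids act as the sites of the template Voronoi diagram.
centroids = kmeans(points, k=2)

# A query point is placed in the cell of its nearest centroid.
query_cell = voronoi_cell((9.0, 9.0), centroids)
```

If explicit cell polygons are needed, for example to measure areas, a computational geometry library would be the usual route; the nearest-site lookup above only answers which cell a point falls in.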

Keywords: audio similarity measure; genre classification; content-based recommendation; music information retrieval.

DOI: 10.1504/IJISTA.2017.088054

International Journal of Intelligent Systems Technologies and Applications, 2017 Vol.16 No.4, pp.309 - 318

Received: 11 Apr 2016
Accepted: 13 Feb 2017

Published online: 20 Nov 2017