Authors: Clay Woolam, Latifur Khan
Addresses: University of Texas at Dallas, 800 West Campbell Road, Richardson, 75080, Texas, 972-883-4137, USA; University of Texas at Dallas, 800 West Campbell Road, Richardson, 75080, Texas, 972-883-4137, USA
Abstract: This paper looks into classification of documents that have hierarchical labels and are not restricted to a single label. Previous work in hierarchical classification focuses on the hierarchical perceptron (Hieron) algorithm. Hieron only supports single label learning. We investigate applying several standard multi-label learning techniques to Hieron. We then propose an extension of the algorithm (MultiHieron) that significantly outperforms all previously mentioned techniques. MultiHieron has a new aggregate loss function for multiple labels. Improvement is shown on the Aviation Safety Reporting System (ASRS) flight anomaly database and the OntoNews corpus using both flat and hierarchical categorisation metrics.
Keywords: document classification; semantic web; multi-label learning; multi-class; aviation safety; safety reporting; flight anomaly database; hierarchical perceptron; ontology; large margin perceptron; data mining; loss function; multiple labels.
International Journal of Data Mining, Modelling and Management, 2008 Vol.1 No.1, pp.5 - 22
Published online: 14 Jan 2009