Inderscience Publishers
  PUBLISHERS OF DISTINGUISHED ACADEMIC, SCIENTIFIC AND PROFESSIONAL JOURNALS

Article Abstract

Title: TaxaMiner: an experimentation framework for automated taxonomy bootstrapping
  Author: Vipul Kashyap, Cartic Ramakrishnan, Christopher Thomas, A. Sheth
  Address: Clinical Informatics R&D, Partners HealthCare System, 93 Worcester St., Wellesley, MA 02481, USA; LSDIS Lab, Department of Computer Science, University of Georgia, 415 GSRC, Athens, GA 30602, USA; LSDIS Lab, Department of Computer Science, University of Georgia, 415 GSRC, Athens, GA 30602, USA; LSDIS Lab, Department of Computer Science, University of Georgia, 415 GSRC, Athens, GA 30602, USA
  Journal: International Journal of Web and Grid Services, 2005, Vol. 1, No. 2, pp. 240-266
  Abstract: Constructing domain ontologies for the Semantic Web is a human- and resource-intensive process, and reducing that effort is crucial for the Semantic Web to scale. We present a framework for automated taxonomy construction that involves: (a) generating a cluster hierarchy from a document corpus using statistical clustering and NLP techniques; (b) extracting a topic hierarchy from this cluster hierarchy; and (c) assigning labels to nodes in the topic hierarchy. We identify metrics for estimating topic hierarchy quality and the parameters of an experimentation framework. MEDLINE served as the document corpus and the MeSH thesaurus as the gold standard.
  Keywords: semantic web; domain ontologies; topic hierarchies; taxonomies; ontology learning; automatic taxonomy generation; statistical clustering; natural language processing; taxonomy quality metrics; label assignment; taxonomy bootstrapping; cluster hierarchy; medical literature.
  DOI: 10.1504/IJWGS.2005.008322
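
The three steps named in the abstract can be illustrated with a toy sketch. This is not the TaxaMiner implementation: the abstract does not specify the clustering algorithm, the extraction rule, or the labelling scheme, so everything below is an assumption made for illustration. It assumes bag-of-words vectors, cosine similarity, simple agglomerative merging, a hypothetical `min_docs` size threshold for deciding which clusters become topics, and most-frequent-term labels. All function names are hypothetical, and the four short strings stand in for a real corpus such as MEDLINE.

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    # Bag-of-words term counts; an assumed stand-in for real NLP preprocessing.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_cluster_hierarchy(docs):
    # Step (a): agglomerative clustering, merging the two most similar clusters
    # until a single root remains. Expects at least one document.
    nodes = [{"terms": vectorize(d), "docs": [i], "children": []} for i, d in enumerate(docs)]
    while len(nodes) > 1:
        i, j = max(
            ((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))),
            key=lambda p: cosine(nodes[p[0]]["terms"], nodes[p[1]]["terms"]),
        )
        merged = {
            "terms": nodes[i]["terms"] + nodes[j]["terms"],
            "docs": nodes[i]["docs"] + nodes[j]["docs"],
            "children": [nodes[i], nodes[j]],
        }
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]

def extract_topic_hierarchy(node, min_docs=2):
    # Step (b): keep only child clusters holding at least min_docs documents
    # (an assumed rule, chosen only to keep the sketch short).
    kept = [extract_topic_hierarchy(c, min_docs)
            for c in node["children"] if len(c["docs"]) >= min_docs]
    return {"terms": node["terms"], "docs": node["docs"], "children": kept}

def assign_labels(node, used=frozenset()):
    # Step (c): label each topic with its most frequent term that no ancestor uses.
    candidates = [t for t, _ in node["terms"].most_common() if t not in used]
    node["label"] = candidates[0] if candidates else None
    for child in node["children"]:
        assign_labels(child, used | {node["label"]})
    return node

def show(node, depth=0):
    # Print the labelled topic hierarchy, indented by depth.
    print("  " * depth + f"{node['label']} (docs {node['docs']})")
    for child in node["children"]:
        show(child, depth + 1)

docs = ["cell heart attack", "cell heart failure", "cell lung cancer", "cell lung tumor"]
show(assign_labels(extract_topic_hierarchy(build_cluster_hierarchy(docs))))
```

A taxonomy produced this way could then be compared against a gold standard such as MeSH, which is the role the abstract assigns to the quality metrics; the sketch does not attempt that comparison.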