Title: Application of association rules mining to Named Entity Recognition and co-reference resolution for the Indonesian language

Authors: Indra Budi, Stephane Bressan

Addresses: Faculty of Computer Science, University of Indonesia, Indonesia. School of Computing, National University of Singapore, Singapore

Abstract: In this paper, we propose a new method, association rules mining, for Named Entity Recognition (NER) and co-reference resolution. The method combines several morphological and lexical features, such as Pronoun Class (PC), Name Class (NC), String Similarity (SP) and Position (P) in the text, into a vector of attributes. Applied to a corpus of newspaper articles in the Indonesian language, the method outperforms a state-of-the-art maximum entropy method in named entity recognition and is comparable with a state-of-the-art machine learning method, the decision tree, for co-reference resolution.
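
The abstract does not spell out the mining procedure, so the following is only a minimal sketch of how association rules of the form "attribute values imply a label" might be mined from attribute vectors. The attribute names PC, NC, SP and P come from the abstract; the attribute values, the labels, the toy rows, the thresholds and the function name `mine_rules` are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (invented for illustration, not from the paper):
# each row is an attribute vector for a candidate pair, plus a label.
ROWS = [
    ({"PC": "third", "NC": "person", "SP": "high", "P": "near"}, "coref"),
    ({"PC": "third", "NC": "person", "SP": "high", "P": "far"}, "coref"),
    ({"PC": "third", "NC": "org", "SP": "low", "P": "near"}, "not_coref"),
    ({"PC": "first", "NC": "person", "SP": "low", "P": "far"}, "not_coref"),
    ({"PC": "third", "NC": "person", "SP": "low", "P": "near"}, "coref"),
    ({"PC": "first", "NC": "org", "SP": "low", "P": "far"}, "not_coref"),
]


def mine_rules(rows, min_support=0.3, min_confidence=0.8, max_len=2):
    """Return rules (antecedent, label, support, confidence).

    The antecedent is a tuple of (attribute, value) pairs with at most
    max_len items. Support is the fraction of rows matching antecedent
    and label; confidence is that count divided by the number of rows
    matching the antecedent alone.
    """
    n = len(rows)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for attrs, label in rows:
        items = sorted(attrs.items())
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))
    # Strongest rules first; shorter antecedents break ties.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


if __name__ == "__main__":
    for antecedent, label, support, confidence in mine_rules(ROWS):
        lhs = " AND ".join(f"{a}={v}" for a, v in antecedent)
        print(f"{lhs} => {label} (support={support:.2f}, confidence={confidence:.2f})")
```

This brute-force enumeration is adequate only for a handful of attributes; a real system would more likely use an Apriori-style search that prunes infrequent itemsets before extending them.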

Keywords: association rules mining; co-reference resolution; named entity recognition; NER; entity equivalence; Indonesian language; Indonesia; information extraction; data mining; information retrieval.

DOI: 10.1504/IJBIDM.2007.016382

International Journal of Business Intelligence and Data Mining, 2007 Vol.2 No.4, pp.426 - 446

Published online: 23 Dec 2007
