Authors: Emilio Corchado, Donald MacDonald, Colin Fyfe
Addresses: Applied Computational Intelligence Research Unit, The University of Paisley, UK (all three authors)
Abstract: We review a technique for creating self-organising maps (SOMs) in a feature space that is nonlinearly related to the original data space. We show that convergence is remarkably fast for this method. We then consider the case where an internet agent quantises internet (HTML) documents using this method. The resulting map has two properties that are interesting from a human perspective: first, the learning forms topology-preserving mappings extremely quickly; second, the learning is most refined for those parts of the feature space that are learned first and that have the most data. By considering the linear feature space, we show that the method draws its power from the interaction between the overcomplete basis in which learning takes place and the mixture of one-shot and incremental learning that comprises the method. Finally, as a tool for quantising document collections, the method has the invaluable property that it scales with the number of documents on a site rather than the number of distinct words. We illustrate its use on a standard dataset.
Keywords: self-organising maps; kernel space; internet agents; feature space; concept formation; html documents; human learning; neural networks; document classification; text contents.
International Journal of Web Engineering and Technology, 2004, Vol. 1, No. 4, pp. 427-436
Available online: 12 Feb 2005