Authors: Ratnesh K. Jain, Ramveer S. Kasana, Deepak Kumar Sahu, Suresh Jain
Addresses: Department of Computer Science and Applications, Dr. H.S. Gour Central University, Sagar, MP, India; Department of Computer Science and Applications, Dr. H.S. Gour Central University, Sagar, MP, India; Kendriya Vidyalaya, Dharamtekri, Chhindwara, MP, India; Department of Computer Engineering, Institute of Engineering & Technology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
Abstract: As the information available on the World Wide Web grows day by day, access to websites also increases, producing a huge amount of web log data (also called web usage data). Discovering and analysing useful information in these web logs has become a practical necessity. Frequent access patterns, the sequences of accesses that users pursue frequently, are among the most interesting and useful kinds of knowledge in practice. Web access pattern tree (WAP-tree) mining is a frequent pattern mining technique for web log access sequences, which first stores the original web access sequence database in a prefix tree of the kind used for storing non-sequential data. The WAP-tree algorithm then mines the frequent sequences from the WAP-tree by recursively reconstructing intermediate trees, starting with suffix sequences and ending with prefix sequences. In this paper, we propose a more efficient algorithm named eWAP-mine (enhanced web access pattern mining algorithm), which is based directly on the initial conditional web access sequence base (1-CWASD) of each frequent event and eliminates the need for reconstructing intermediate conditional WAP-trees.
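As a rough illustration of the first step described above (storing access sequences in a prefix tree), the following minimal Python sketch may help. It is not the authors' eWAP-mine algorithm and does not perform any mining; it only shows how sequences that share a prefix can share tree nodes with counts. The names `Node`, `frequent_events`, and `build_prefix_tree`, the sample sequences, and the choice to count support once per sequence are all assumptions made for this example.

```python
from collections import Counter


class Node:
    """One node of a prefix tree over access events (hypothetical helper)."""

    def __init__(self, event=None):
        self.event = event      # page/event label; None for the root
        self.count = 0          # number of sequences passing through this node
        self.children = {}      # event label -> child Node


def frequent_events(sequences, min_support):
    """Return events that occur in at least min_support sequences.

    Assumption: support is counted once per sequence, not per occurrence.
    """
    support = Counter()
    for seq in sequences:
        support.update(set(seq))
    return {event for event, c in support.items() if c >= min_support}


def build_prefix_tree(sequences, min_support):
    """Insert each sequence, restricted to frequent events, into a prefix tree."""
    keep = frequent_events(sequences, min_support)
    root = Node()
    for seq in sequences:
        node = root
        for event in seq:
            if event not in keep:
                continue  # infrequent events are dropped before insertion
            node = node.children.setdefault(event, Node(event))
            node.count += 1
    return root


# Hypothetical access sequences; each letter stands for one page visit.
sequences = ["abdac", "eaebcac", "babfaec", "afbacfc"]
root = build_prefix_tree(sequences, min_support=3)
```

With a minimum support of 3, only the events `a`, `b`, and `c` survive the filter in this sample, so three of the four sequences share the `a` then `b` prefix at the top of the tree.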
Keywords: web usage data; web access patterns; frequent pattern mining; WAP tree; sequence list; web log data.
International Journal of Data Mining, Modelling and Management, 2010 Vol.2 No.2, pp.176 - 193
Published online: 11 Mar 2010