Title: Research on web page classification-based core characteristics and web structure

Authors: Geng Zengmin; Du Jianxia

Addresses: Computer Information Center, Beijing Institute of Fashion Technology, Beijing, China; Computer Information Center, Beijing Institute of Fashion Technology, Beijing, China

Abstract: The explosive growth of web pages has made web page classification a hotspot of web mining research. This paper presents experimental data on a fashion document corpus obtained with several feature selection and classification methods, gives characterising expressions for specific documents based on core feature terms, and puts forward a web page categorisation algorithm based on web structure. In classification experiments on a corpus of fashion web pages, the algorithm achieves a higher accuracy rate than other classification algorithms, and improves by several points on the result obtained before the adjustment for web structure. The algorithms studied in this paper can also be applied in domains other than fashion web pages.

Keywords: web page classification; web mining; text classification; web structure; fashion web pages.

DOI: 10.1504/IJWMC.2014.062003

International Journal of Wireless and Mobile Computing, 2014 Vol.7 No.3, pp.253 - 257

Received: 15 Jul 2013
Accepted: 17 Jul 2013

Published online: 31 Oct 2014
