Title: Research on web page classification-based core characteristics and web structure
Authors: Geng Zengmin; Du Jianxia
Addresses: Computer Information Center, Beijing Institute of Fashion Technology, Beijing, China; Computer Information Center, Beijing Institute of Fashion Technology, Beijing, China
Abstract: The explosive growth of web pages has made web page classification a hot topic in web mining. This paper reports experimental results on a fashion document corpus using several feature selection and classification methods, gives characterising expressions for specific documents based on core feature terms, and proposes a web page categorisation algorithm based on web structure. In classification experiments on the fashion web page corpus, the algorithm achieves a higher accuracy rate than the other classification algorithms and improves by several points on the result obtained before the web structure adjustment. The algorithms studied in this paper can also be applied to domains other than fashion web pages.
Keywords: web page classification; web mining; text classification; web structure; fashion web pages.
DOI: 10.1504/IJWMC.2014.062003
International Journal of Wireless and Mobile Computing, 2014 Vol.7 No.3, pp.253 - 257
Received: 15 Jul 2013
Accepted: 17 Jul 2013
Published online: 31 Oct 2014