Authors: Gang Tian; Chengai Sun; Ke-qing He; Xiang-min Ji
Addresses: College of Information and Science Engineering, Shandong University of Science and Technology, Shandong Qingdao 266590, China, and State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Hubei Wuhan 430072, China; College of Information and Science Engineering, Shandong University of Science and Technology, Shandong Qingdao 266590, China; State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Hubei Wuhan 430072, China; College of Computer and Information Science, Fujian Agriculture and Forestry University, Fujian Fuzhou 350002, China
Abstract: The growing number of web services makes it harder to find the desired service, and clustering web services can greatly improve web service discovery. Most existing clustering approaches are designed to handle long text documents. However, most service descriptions are short texts, and the resulting lack of statistical information impairs clustering quality. To solve this problem, we propose a new service clustering approach based on transfer learning from auxiliary long text data obtained from Wikipedia. To handle the semantic inconsistencies between service descriptions and auxiliary data, we introduce a novel topic model, dual tag aided latent Dirichlet allocation (DT-LDA), which jointly learns two sets of topics on the two datasets. Experimental results show that the proposed approach achieves better performance than several existing approaches.
Keywords: web service clustering; auxiliary knowledge; topic modelling; transfer learning; knowledge transfer; web services; service discovery; long text data; semantics; service descriptions; dual tag LDA; latent Dirichlet allocation.
International Journal of High Performance Computing and Networking, 2016 Vol.9 No.1/2, pp.160 - 169
Available online: 12 Feb 2016