Title: Topic-driven top-k similarity search by applying constrained meta-path based in content-based schema-enriched heterogeneous information network
Authors: Phu Pham; Phuc Do
Addresses: Faculty of Information Systems, University of Information Technology (UIT), VNU-HCM, Quarter 6, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Vietnam (both authors)
Abstract: In this paper, we propose TopCPathSim, a model for 'topic-driven' similarity search based on 'constrained meta-paths' (also called 'restricted meta-paths') between objects of the same type in content-based heterogeneous information networks (HINs). The topic distributions over content-based objects, such as papers and articles in a bibliographic network or user comments and reviews in a social network, are obtained with the LDA topic model. Experiments on the real-world DBLP, Aminer and ACM datasets demonstrate the effectiveness of the proposed model, which achieves an accuracy of about 73.56%. The results also show that combining a probabilistic topic model with constrained meta-paths is a promising way to improve the output quality of topic-oriented similarity search in content-based HINs.
Keywords: constrained meta-path; content-based heterogeneous information network; topic-driven similarity search; LDA; topic modelling.
International Journal of Business Intelligence and Data Mining, 2020, Vol. 17, No. 3, pp. 349-376
Received: 20 Dec 2017
Accepted: 14 Feb 2018
Published online: 24 Apr 2020
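
The idea summarised in the abstract, meta-path similarity between same-typed objects restricted by topic, can be illustrated with a minimal sketch. This is not the authors' TopCPathSim: it uses the standard PathSim measure on an author-paper-author meta-path and, as an assumption made purely for illustration, treats the "constraint" as keeping only papers whose probability for a chosen topic meets a threshold. The function names, the toy matrices, and the threshold are all hypothetical, and the topic distributions are written by hand here rather than obtained from LDA.

```python
import numpy as np

def topic_constrained_commuting(author_paper, paper_topic, topic, threshold):
    """Count author-paper-author paths that pass only through papers
    whose probability for `topic` is at least `threshold` (assumed constraint)."""
    keep = paper_topic[:, topic] >= threshold
    ap = author_paper[:, keep]
    return ap @ ap.T

def pathsim(M):
    """Standard PathSim: s(x, y) = 2*M[x, y] / (M[x, x] + M[y, y])."""
    diag = np.diag(M).astype(float)
    denom = diag[:, None] + diag[None, :]
    out = np.zeros_like(denom)
    # Pairs with no qualifying paths at all keep a similarity of 0.
    np.divide(2.0 * M, denom, out=out, where=denom > 0)
    return out

def top_k(sim, query, k):
    """Return the k objects most similar to `query`, excluding itself."""
    scores = sim[query].copy()
    scores[query] = -np.inf
    order = np.argsort(-scores, kind="stable")[:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy data (hypothetical): 3 authors, 4 papers, 2 topics.
author_paper = np.array([
    [1, 1, 0, 0],   # author 0 wrote papers 0 and 1
    [0, 1, 1, 0],   # author 1 wrote papers 1 and 2
    [0, 0, 1, 1],   # author 2 wrote papers 2 and 3
])
paper_topic = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.2, 0.8],
    [0.1, 0.9],
])

M = topic_constrained_commuting(author_paper, paper_topic, topic=0, threshold=0.5)
sim = pathsim(M)
ranking = top_k(sim, query=0, k=2)
print(ranking)
```

With topic 0 as the constraint, only papers 0 and 1 contribute paths, so author 1 (who shares paper 1 with author 0) ranks above author 2, who has no paper on that topic.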