Title: Biomedical text mining algorithm based on domain ontology

Authors: Xiaoling Jiang; Hui Zhang; Jiaming Xu; Weicheng Wu; Xun Mo

Addresses: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huai'an 223003, China; Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huai'an 223003, China; Office of Academic Affairs, Huaiyin Institute of Technology, Huai'an 223003, China; Dean's Office, Huaiyin Institute of Technology, Huai'an 223003, China; Technology Faculty of Applied Technology, Huaiyin Institute of Technology, Huai'an 223003, China

Abstract: To reduce mining time and the generalisation error of mining results, and to improve their recall rate, this paper designs a biomedical text mining algorithm based on domain ontology. Text information retrieval is performed using the text length anomaly coefficient and the mixed features of the text data. The gene ontology and disease ontology within the domain ontology are analysed to identify their ontology names. Relationships between text concepts are derived from feature weights based on word frequency, and a latent Dirichlet allocation (LDA) model is then used to mine the biomedical text. Experimental results show that the maximum mining time of the method is 0.88 min, the minimum generalisation error of the mining results is 0.011, and the recall rate remains above 85%, indicating that the method meets its design goals.
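
The abstract does not state the formula behind the word-frequency feature weight, so the following is only a minimal illustrative sketch, assuming a smoothed TF-IDF weight as a stand-in and cosine similarity between per-document weight vectors as a rough measure of how related two concepts are. The function names (`tfidf_weights`, `term_vector`, `cosine`) and the toy documents are hypothetical and are not taken from the paper; the retrieval, ontology naming, and LDA steps are not shown.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenised document.

    Assumption: smoothed TF-IDF stands in for the paper's
    word-frequency feature weight, which the abstract does not define.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        weights.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return weights


def term_vector(term, weights):
    """Weight of `term` in every document (0.0 where it is absent)."""
    return [w.get(term, 0.0) for w in weights]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy, hand-written token lists (not data from the paper).
docs = [
    ["brca1", "mutation", "breast", "cancer"],
    ["brca1", "breast", "cancer", "risk"],
    ["insulin", "glucose", "diabetes"],
]
weights = tfidf_weights(docs)
related = cosine(term_vector("brca1", weights), term_vector("cancer", weights))
unrelated = cosine(term_vector("brca1", weights), term_vector("diabetes", weights))
```

In this toy corpus, terms that appear in the same documents with similar weights score close to 1, while terms that never co-occur score 0. A topic model such as LDA would normally be fitted on a much larger document-term matrix than this.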

Keywords: biomedical science; text mining; domain ontology; ontology naming; feature weight.

DOI: 10.1504/IJDMB.2022.130337

International Journal of Data Mining and Bioinformatics, 2022 Vol.27 No.1/2/3, pp.187-200

Received: 22 Aug 2022
Accepted: 18 Oct 2022

Published online: 17 Apr 2023
