Authors: Khushboo Chandel; Veenita Kunwar; A. Sai Sabitha; Abhay Bansal; Tanupriya Choudhury
Addresses: Amity University, Noida, Uttar Pradesh, India; Amity University, Noida, Uttar Pradesh, India; Amity University, Noida, Uttar Pradesh, India; Amity University, Noida, Uttar Pradesh, India; Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
Abstract: Data mining in medicine has been used to discover previously unknown patterns in health data and to obtain diagnostic results. The healthcare industry generates large amounts of complex data about patients, diseases and treatments. Data mining in healthcare offers benefits such as detecting fraud, making medical facilities available to patients at low cost, ensuring high-quality patient care and informing healthcare policy. Disease detection has become essential as the number of health issues grows. Thyroid disease, which causes improper functioning of the thyroid gland, is one such concern, with numerous cases detected every year. In this paper, a clustering technique is used to detect and understand the factors influencing thyroid disease. The DBSCAN algorithm is used because it can handle clusters of varying shapes and sizes and is resistant to noise. Principal component analysis (PCA) is also applied to find patterns in high-dimensional data and to reduce its dimensionality. The experimental setup has been implemented in RapidMiner.
Keywords: data mining; clustering; thyroid disease; DBSCAN; principal component analysis.
International Journal of Business Intelligence and Data Mining, 2020 Vol.17 No.3, pp.273 - 297
Accepted: 26 Jan 2018
Published online: 24 Apr 2020
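
The abstract notes that DBSCAN is resistant to noise. As a rough illustration of that property only, the following is a minimal standard-library Python sketch of DBSCAN. It is not the authors' RapidMiner process: the points are synthetic two-feature values invented for the example (not thyroid data), and the names `dbscan`, `region_query`, `eps` and `min_pts` as well as the chosen parameter values are assumptions made here.

```python
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet examined


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not a core point; may be relabeled later as a border point.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is itself a core point, so the cluster keeps growing.
                queue.extend(j_neighbors)
    return labels


# Synthetic data (assumption): two dense groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 5.2), (4.9, 4.8), (5.2, 5.1),
    (9.0, 0.5),
]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)
```

In this toy setting the two dense groups receive separate cluster labels, while the isolated point is labeled -1 (noise) rather than being forced into a cluster. On real data, `eps` and `min_pts` would need tuning, and features would typically be scaled (and possibly reduced with PCA) first.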