Template-Type: ReDIF-Article 1.0 Author-Name: Francesca Rossignoli Author-X-Name-First: Francesca Author-X-Name-Last: Rossignoli Author-Name: Nicola Tommasi Author-X-Name-First: Nicola Author-X-Name-Last: Tommasi Title: Abnormal accrual estimation: an automation data analysis technique Abstract: Accounting studies rely on predictive analytics to estimate abnormal accruals as indicators of managerial opportunism. Abnormal accruals are estimated by running predictive models and manually imposing a combination of conditions to select the control sample. This process is executed using loops where the estimation is repeated over the control observations meeting the combined conditions. The recursive estimation generates several inefficiencies. We provide a technique to estimate abnormal measures by automatising: i) the estimation of the predictive model; and ii) the selection of the control sample according to multiple procedures. The command offers a unique information set about the estimation results and process. We illustrate the use of abnormalest through empirical applications. We compare the accuracy of predictions under different approaches and models. The command abnormalest allows to overcome the inefficiencies, provides a unique set of information about the estimation, and is extendible to every social science. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 1-18 Issue: 5 Volume: 17 Year: 2025 Keywords: abnormal estimation; abnormal accrual; earnings management; prediction model; financial accounting.
File-URL: http://www.inderscience.com/link.php?id=144576 File-Format: text/html File-Restriction: Open Access Handle: RePEc:ids:injdan:v:17:y:2025:i:5:p:1-18 Template-Type: ReDIF-Article 1.0 Author-Name: Irfan Saleem Author-X-Name-First: Irfan Author-X-Name-Last: Saleem Author-Name: Ali Irfan Author-X-Name-First: Ali Author-X-Name-Last: Irfan Title: Machine learning made easy: a beginner's guide for causal inference and discovery methods using Python Abstract: Machine learning is widely recognised and extensively used for data modelling and prediction across fields, including business and healthcare, to name a few of them, for informed decision-making. Numerous machine learning algorithms have been devised and deployed across multiple programming languages throughout the preceding decades for causal inference and discovery. This research, however, briefly introduces causal inference and discovery methods, accompanied by Python code for beginners. First, this study talks about machine learning in brief. Then, this study differentiates between causal discovery and causal inference. Thirdly, the study aims to describe popular machine-learning methods. Finally, this paper demonstrates the practical uses of these causal inference and discovery packages in Python. The study has recommended future research and implications for using machine learning methods. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 36-53 Issue: 1 Volume: 17 Year: 2025 Keywords: Python; machine learning; causal discovery (CD); causal inference (CI); linear regression; Peter-Clark (PC) algorithm; artificial intelligence. File-URL: http://www.inderscience.com/link.php?id=144962 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. Handle: RePEc:ids:injdan:v:17:y:2025:i:1:p:36-53 Template-Type: ReDIF-Article 1.0 Author-Name: Satish N. Gujar Author-X-Name-First: Satish N. 
Author-X-Name-Last: Gujar Author-Name: Ashish Gupta Author-X-Name-First: Ashish Author-X-Name-Last: Gupta Author-Name: Sanjay Kumar P. Pingat Author-X-Name-First: Sanjay Kumar P. Author-X-Name-Last: Pingat Author-Name: Rashmi Pandey Author-X-Name-First: Rashmi Author-X-Name-Last: Pandey Author-Name: Atul Kumar Author-X-Name-First: Atul Author-X-Name-Last: Kumar Author-Name: Deepak Gupta Author-X-Name-First: Deepak Author-X-Name-Last: Gupta Author-Name: Priya Pise Author-X-Name-First: Priya Author-X-Name-Last: Pise Title: Brain tumour detection and multi classification using GNB-based machine learning architecture Abstract: Brain tumours are abnormal tissues with rapidly reproducing cells, posing significant challenges for identification and treatment. This study proposes a multimodal approach using machine learning and medical techniques for early diagnosis and segmentation of brain tumours. Noisy magnetic resonance imaging (MRI) scans are processed with a geometric mean to simplify noise removal. Fuzzy c-means algorithms segment the images, aiding in the detection of specific areas of interest. The grey-level co-occurrence matrix (GLCM) algorithm is used for dimension reduction and feature extraction. Various machine learning techniques, including Convolutional Neural Networks (CNN), Artificial Neural Networks (ANN), Support Vector Machine (SVM), Gaussian Naive Bayes (NB), and Adaptive Boosting, classify the images. Among these methods, Gaussian NB is particularly effective for identifying and classifying brain tumours. This approach leverages advanced AI and neural network techniques to enhance early diagnosis and improve treatment outcomes. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 20-35 Issue: 1 Volume: 17 Year: 2025 Keywords: machine learning; GLCM; grey-level co-occurrence matrix; Gaussian Naive Bayes; adaptive boosting; MRI; magnetic resonance imaging.
File-URL: http://www.inderscience.com/link.php?id=144963 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. Handle: RePEc:ids:injdan:v:17:y:2025:i:1:p:20-35 Template-Type: ReDIF-Article 1.0 Author-Name: Yimiao Zhang Author-X-Name-First: Yimiao Author-X-Name-Last: Zhang Author-Name: Jing Ren Author-X-Name-First: Jing Author-X-Name-Last: Ren Author-Name: Wenting Liu Author-X-Name-First: Wenting Author-X-Name-Last: Liu Author-Name: Ding Ding Author-X-Name-First: Ding Author-X-Name-Last: Ding Title: Application of text mining analysis in understanding GameFi adoption Abstract: Blockchain-based gaming industry has been expanding over the past two years, but the GameFi sector has yet to solve its biggest problem - the lack of mass gamer adoption. In this work, text mining was leveraged to study the adoption status of GameFi and explore the possible requirements and concerns of game players regarding blockchain games. Quora questions relating to GameFi were collected to examine the key topics discussed by GameFi users or potential users. Our findings disclosed that GameFi is in the early stage of the innovation diffusion process and has not been widely adopted by the public. Individuals are concerned about the risk and return of play-to-earn (P2E) games, and some potential users are deterred by the high entry barriers of GameFi. Through studying the opinions of players or potential players, this study sheds some light on the possible strategies for improving blockchain game design in the near future. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 1-19 Issue: 1 Volume: 17 Year: 2025 Keywords: GameFi; P2E; play-to-earn; mass adoption; text analysis. File-URL: http://www.inderscience.com/link.php?id=144964 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. 
Handle: RePEc:ids:injdan:v:17:y:2025:i:1:p:1-19 Template-Type: ReDIF-Article 1.0 Author-Name: Jyoti Deone Author-X-Name-First: Jyoti Author-X-Name-Last: Deone Author-Name: Nilima Dongre Author-X-Name-First: Nilima Author-X-Name-Last: Dongre Author-Name: Mohammad Atique Author-X-Name-First: Mohammad Author-X-Name-Last: Atique Title: Prediction of success factors for mobile application using machine learning technique Abstract: The remarkable boom in the mobile market has attracted many developers to build mobile apps. However, the majority of developers are suffering to generate earnings. For those developers, knowing the characteristics of successful apps may be very vital. We propose an approach which examines the categories of apps by two factors. First, the correlation is measured between app features and secondly, concepts are extracted from apps to understand the common theme present in them. For this, we selected 3000 applications available in the Google Play Store. The observations specify that there may be a strong correlation among purchaser rating and the quantity of app downloads, though there may be no correlation between rate and downloads, nor among charge and rating. Moreover, we find standards unique to excessive rated apps and low rated apps. The correlation along with the concepts proves useful for application developers to understand the market trend and customer demand more easily than earlier approaches. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 54-64 Issue: 1 Volume: 17 Year: 2025 Keywords: Android; LSA; correlation; mobile; market; extraction; downloads; apps; customers; playstore; Google. File-URL: http://www.inderscience.com/link.php?id=144965 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. 
Handle: RePEc:ids:injdan:v:17:y:2025:i:1:p:54-64 Template-Type: ReDIF-Article 1.0 Author-Name: Donald Douglas Atsa'am Author-X-Name-First: Donald Douglas Author-X-Name-Last: Atsa'am Author-Name: Gabriel Shimasaan Iorundu Author-X-Name-First: Gabriel Shimasaan Author-X-Name-Last: Iorundu Author-Name: Moses Terkula Ukeyima Author-X-Name-First: Moses Terkula Author-X-Name-Last: Ukeyima Title: Nutritional cluster analysis of leguminous food sources across West Africa Abstract: The present form of the data on West African legumes reported in the West Africa Food Composition Table (WAFCT) does not reflect sub-groupings based on (dis)similarity in nutritive value. A possible consequence is that an uninformed user interested in leguminous food could randomly pick any from the data since all are summarily classified as one family in the WAFCT. To resolve this, the objective of this study was to apply the clustering technique to form sub-groups based on similarity in nutritional content. Three clusters were extracted, and unique properties have been established for food sources in each cluster at the granular level of nutrients. Going by the clustering, users who are interested/not interested in a particular content could look up the cluster with a lower, moderate, or higher content of the desired/non-desired element. The results are useful in the selection of raw materials, formulation of nutritional guidelines, and food labelling. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 65-75 Issue: 1 Volume: 17 Year: 2025 Keywords: legumes; nutritional analysis; legumes food sources; WAFCT; West Africa Food Composition Table; k-means clustering. File-URL: http://www.inderscience.com/link.php?id=144966 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. Handle: RePEc:ids:injdan:v:17:y:2025:i:1:p:65-75 Template-Type: ReDIF-Article 1.0 Author-Name: G. Umarani Srikanth Author-X-Name-First: G.
Umarani Author-X-Name-Last: Srikanth Author-Name: Lijetha C. Jaffrin Author-X-Name-First: Lijetha C. Author-X-Name-Last: Jaffrin Author-Name: Sushmitha Srikanth Author-X-Name-First: Sushmitha Author-X-Name-Last: Srikanth Author-Name: Shyam Ramesh Author-X-Name-First: Shyam Author-X-Name-Last: Ramesh Title: Tackling data sparsity: a hybrid filtering paradigm for robust recommender systems Abstract: This paper introduces a hybrid recommender system approach that aims to tackle the problems associated with data sparsity, also referred to as the 'cold start problem'. Recommender systems use user preferences to filter information. To improve recommendation accuracy, our method combines user-based and content-based collaborative filtering techniques. More specifically, content-based filtering takes over when there is little data. When there is a high degree of user similarity, user-based collaborative filtering is used to maximise accuracy by suggesting diverse items. This strategy can be used in a variety of fields, including e-commerce, music, books, and film. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 77-106 Issue: 2 Volume: 17 Year: 2025 Keywords: hybrid filtering; recommender systems; collaborative filtering; SVD; singular value decomposition; machine learning; k-nearest neighbours. File-URL: http://www.inderscience.com/link.php?id=147515 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. Handle: RePEc:ids:injdan:v:17:y:2025:i:2:p:77-106 Template-Type: ReDIF-Article 1.0 Author-Name: A. Anushya Author-X-Name-First: A.
Author-X-Name-Last: Anushya Author-Name: Savita Shiwani Author-X-Name-First: Savita Author-X-Name-Last: Shiwani Author-Name: Ayush Shrivastava Author-X-Name-First: Ayush Author-X-Name-Last: Shrivastava Title: Jasminum Grandiflorum flower images classification: deep learning and transfer learning models with the influence of preprocessing via contours and convex hull in Agritech 4.0 Abstract: This study specifically centres on classifying Jasminum Grandiflorum flowers through the utilisation of deep learning and transfer learning techniques. To achieve this, the research leverages advanced deep learning models such as CNNs, along with transfer learning using pre-trained architectures like VGG16, VGG19, ResNet18, and Vision Transformer. CNN stood out, excelling after extensive iterations. VGG 16 and 19 showed solid performance with fewer iterations, indicating competence in shorter training times. ResNet18 achieved the highest accuracy with fewer iterations but took longer (about 8 minutes per epoch), balancing efficiency and accuracy. ViT impressed with high accuracy despite needing more iterations, showcasing prowess in intricate learning and pattern recognition in the Jasminum Grandiflorum flower image dataset. The intended outcome of this research is to contribute significantly to the advancement of Agritech 4.0 by establishing a robust methodology for accurate Jasminum Grandiflorum flower classification without human participation. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 160-175 Issue: 2 Volume: 17 Year: 2025 Keywords: CNN; convolutional neural network; VGG16; VGG19; ResNet18; ViT; vision transformer; Jasminum Grandiflorum; AgriTech 4.0. File-URL: http://www.inderscience.com/link.php?id=147516 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. 
Handle: RePEc:ids:injdan:v:17:y:2025:i:2:p:160-175 Template-Type: ReDIF-Article 1.0 Author-Name: Semeneh Hunachew Bayih Author-X-Name-First: Semeneh Hunachew Author-X-Name-Last: Bayih Author-Name: Surafel Lulseged Tilahun Author-X-Name-First: Surafel Lulseged Author-X-Name-Last: Tilahun Title: Adaptive parking demand prediction using discrete time based dynamic Markov chain Abstract: The demand for urban parking rapidly increases and becomes a significant traffic issue in densely populated metropolitan regions. Prediction of parking demand is crucial for reducing traffic jams and decreasing greenhouse gas emissions. It is also essential to the development of parking facilities and price adjustments in urban parking planning. Most of the earlier studies developed model for parking demand prediction using historical data which lack to update the demand data. Furthermore, the demand predictions are not considering the effect of parking pricing. However, parking pricing affects the demand in a given parking platform. To address this issue, we have considered three categories of parking demand based on price based preference. Dynamic non-homogeneous Markov chain with discrete time and discrete state is used to predict the parking demand. An adaptive approach or a learning approach is proposed to make the Markov chain dynamic and to adapt changes in the demand environment. A numerical example demonstrating the prediction from data collection as well as incorporating the adaptive strategy so that the system learning new changes, is presented. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 140-159 Issue: 2 Volume: 17 Year: 2025 Keywords: prediction; parking demand; Markov chain model; adaptive learning; discrete time; demand categorisation. File-URL: http://www.inderscience.com/link.php?id=147517 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. 
Handle: RePEc:ids:injdan:v:17:y:2025:i:2:p:140-159 Template-Type: ReDIF-Article 1.0 Author-Name: Wan Aezwani Wan Abu Bakar Author-X-Name-First: Wan Aezwani Wan Abu Author-X-Name-Last: Bakar Author-Name: Muhammad Amierusyahmi Bin Zuhairi Author-X-Name-First: Muhammad Amierusyahmi Bin Author-X-Name-Last: Zuhairi Author-Name: Mustafa Bin Man Author-X-Name-First: Mustafa Bin Author-X-Name-Last: Man Author-Name: Nur Laila Najwa Bt Josdi Author-X-Name-First: Nur Laila Najwa Bt Author-X-Name-Last: Josdi Title: Enhancing healthcare predictions with deep learning: insights from image datasets Abstract: This study builds on prior research to improve healthcare predictions using deep learning with image datasets. Unlike numerical data, image processing in deep learning faces challenges such as large data volume, storage demands, computational resource needs, manual annotation, class imbalance, overfitting, and scalability issues. Effective solutions require robust preprocessing, efficient computation, thoughtful model design, and ethical considerations. This paper presents a 3-layer deep convolutional neural network (DCNN) to integrate image datasets, achieving 99% accuracy on benchmark datasets, including the brain tumor medical dataset (BTMD). The model employs dropout regularisation and incorporates numeric data insights, showcasing adaptability across different healthcare data types. These results highlight the significant potential of DCNNs for high-accuracy predictions in medical applications. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 107-120 Issue: 2 Volume: 17 Year: 2025 Keywords: image dataset; DCNN; deep convolutional neural network; BTMD; brain tumour medical dataset; prediction accuracy; healthcare applications; healthcare prediction; deep learning. File-URL: http://www.inderscience.com/link.php?id=147518 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. 
Handle: RePEc:ids:injdan:v:17:y:2025:i:2:p:107-120 Template-Type: ReDIF-Article 1.0 Author-Name: Lina M. Lozano-Suarez Author-X-Name-First: Lina M. Author-X-Name-Last: Lozano-Suarez Author-Name: Fabian A. Torres-Cardenas Author-X-Name-First: Fabian A. Author-X-Name-Last: Torres-Cardenas Author-Name: Eduardo Rangel Díaz Author-X-Name-First: Eduardo Rangel Author-X-Name-Last: Díaz Title: A data analytics approach to improve the international supply of metal inputs in the metal-mechanical sector in Colombia Abstract: The metal-mechanical sector is vital to Colombia's industry, significantly contributing to economic development. To ensure its growth, this sector must enhance competitiveness, particularly in managing metal supplies, often imported. Analysing imports is crucial, but data from DIAN is unprocessed and provided in extensive Excel microdata packages, requiring processing. This study proposes a data analytics approach combining descriptive and predictive analyses. Descriptive analysis using DIAN's 2023 data identifies key import factors: major supplier countries, main customs entries, locations of top importers, and common transport modes. Predictive analysis using regression, decision trees, and k-NN models predicts import quantities based on free on board (FOB) value, with regression showing the highest accuracy. This approach helps companies understand factors affecting imports, such as transportation, customs management, cargo handling, and preparation, facilitating better decision-making and competitiveness. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 121-139 Issue: 2 Volume: 17 Year: 2025 Keywords: data analytics; supply chain; machine learning; international supply; regression model; decision tree; k-NN; metal-mechanical sector; CRISP-DM; dashboard. File-URL: http://www.inderscience.com/link.php?id=147519 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. 
Handle: RePEc:ids:injdan:v:17:y:2025:i:2:p:121-139 Template-Type: ReDIF-Article 1.0 Author-Name: Samira Hazmoune Author-X-Name-First: Samira Author-X-Name-Last: Hazmoune Author-Name: Fateh Bougamouza Author-X-Name-First: Fateh Author-X-Name-Last: Bougamouza Title: Emoji translation for sentiment analysis in Algerian Arabic dialect Abstract: Sentiment analysis (SA) is an important natural language processing (NLP) field that involves extracting sentiments and opinions from text data. Although SA has advanced significantly, its application to dialectal Arabic text presents challenges due to linguistic nuances and resource constraints. This research investigates the incorporation of emojis into SA for Algerian Arabic dialect (AAD), marking the first exploration of its kind in this area. Specifically, we focus on emoji translation, building upon prior studies highlighting emojis' potential in SA and their translation into meaningful words or sentences as a preprocessing approach. We evaluate the impact of this approach on enhancing sentiment classification in AAD text, specifically focusing on customer reviews of Algerian telephone operators. After preprocessing, including various emoji translation techniques, we employ transfer learning by fine-tuning DziriBERT model on a compiled Algerian dialect dataset. Our results demonstrate promising outcomes and offer novel conclusions and perspectives in AAD sentiment analysis. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 216-237 Issue: 3 Volume: 17 Year: 2025 Keywords: sentiment analysis; emoji translation; DziriBERT; AAD; Algerian Arabic dialect; transfer learning; emoji categorisation; emoji handling; customer reviews. File-URL: http://www.inderscience.com/link.php?id=148561 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:injdan:v:17:y:2025:i:3:p:216-237 Template-Type: ReDIF-Article 1.0 Author-Name: Md Nurul Islam Author-X-Name-First: Md Nurul Author-X-Name-Last: Islam Author-Name: Iqbal Hasan Author-X-Name-First: Iqbal Author-X-Name-Last: Hasan Author-Name: Shahla Tarannum Author-X-Name-First: Shahla Author-X-Name-Last: Tarannum Author-Name: S.M.K. Quadri Author-X-Name-First: S.M.K. Author-X-Name-Last: Quadri Title: Analysis of online transaction using data analytics framework Abstract: Nowadays, online transactions have become a necessity for everyone; thus, they generate a vast amount of data, which requires a robust framework to ensure their security, efficiency, and reliability. This research paper explores the application of advanced data analytics techniques to ensure and enhance the confidentiality of the online transaction process. Using this analytics framework, we can analyse patterns, detect anomalies, and predict trends with online transaction data. An online survey was conducted to collect data from one lakh consumers of different geographical regions and diverse working groups. Descriptive analysis has been used in this study to ascertain the present state of online transactions. The study investigates the significance of feature selection, anomaly detection, and clustering methods in identifying patterns, trends, and potential fraud indicators within online transactions. The findings of this research contribute to the growing body of knowledge on leveraging data analytics frameworks to extract valuable insights from online transaction data. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 177-195 Issue: 3 Volume: 17 Year: 2025 Keywords: online transactions; data analytics; online payment; security; e-commerce; analysis. File-URL: http://www.inderscience.com/link.php?id=148562 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:injdan:v:17:y:2025:i:3:p:177-195 Template-Type: ReDIF-Article 1.0 Author-Name: Himani S. Deshpande Author-X-Name-First: Himani S. Author-X-Name-Last: Deshpande Author-Name: Leena Ragha Author-X-Name-First: Leena Author-X-Name-Last: Ragha Title: An empirical examination of classification algorithms and resampling strategies for dealing with imbalanced datasets: a comparative analysis Abstract: Imbalanced datasets can lead to biased models and inaccurate predictions, thus making it a crucial issue to be addressed. This research comprehensively analyses issues, approaches and evaluation parameters to work with imbalanced dataset based machine learning models. Literature suggests that data imbalance handling methods are categorised into three broad categories namely pre-processing methods, cost-sensitive learning, and ensemble methods. Experiments are conducted to test popular classifiers in combination with three pre-processing methods namely clustered smote, random over sampling, and scaled values on seven standard imbalanced datasets. The results of the study show that the Random Forest classifier with the Random Over Sampling pre-processing method performed best for most of the datasets, with precision values between 0.68 and 1, AUC values between 0.83 and 1, and prediction accuracy between 76.1% and 99.8%. This study highlights that the choice of the evaluation metric and the pre-processing method can have a significant impact on the performance of the classifier. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 238-253 Issue: 3 Volume: 17 Year: 2025 Keywords: imbalanced data; over sampling; undersampling; classification; cost sensitive; ensemble learning; feature weighing; instance weighing. File-URL: http://www.inderscience.com/link.php?id=148563 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:injdan:v:17:y:2025:i:3:p:238-253 Template-Type: ReDIF-Article 1.0 Author-Name: Silvano Herculano da Luz Júnior Author-X-Name-First: Silvano Herculano da Luz Author-X-Name-Last: Júnior Author-Name: Willian Farias Carvalho Oliveira Author-X-Name-First: Willian Farias Carvalho Author-X-Name-Last: Oliveira Author-Name: Luis Cesar de Albuquerque Neto Author-X-Name-First: Luis Cesar de Albuquerque Author-X-Name-Last: Neto Author-Name: Hugo Araujo Souza Author-X-Name-First: Hugo Araujo Author-X-Name-Last: Souza Author-Name: Yúri Faro Dantas de Sant'Anna Author-X-Name-First: Yúri Faro Dantas de Author-X-Name-Last: Sant'Anna Title: A cross-sectional analysis of severe SARS cases evolution in a Brazilian municipality using data mining techniques Abstract: The first severe acute respiratory syndrome (SARS) outbreak occurred in China in 2002, followed by other coronavirus variants like MERS (2012), 2019-nCOV (2019), and Omicron (2020). While data mining (DM) has been widely used for SARS classification and decision-making, most studies overlook socioeconomic factors such as income and education. This study applies the cross-industry standard process for data mining (CRISP-DM) framework and DM techniques to predict severe SARS case progression in Recife, Brazil. Using open datasets, it incorporates attributes related to symptoms, pre-existing conditions, and socioeconomic indicators. Three healthcare experts participated in the analysis. Results showed that the apriori algorithm performed best in rule induction, while the decision tree slightly outperformed logistic regression. Notably, correlations emerged between severe case progression and socioeconomic data, underscoring the importance of integrating social determinants in disease classification models. These findings provide insights for improving predictive models and public health strategies. Journal: Int. J. 
of Data Analysis Techniques and Strategies Pages: 196-215 Issue: 3 Volume: 17 Year: 2025 Keywords: SARS; severe acute respiratory syndrome; data mining; machine learning; apriori; ROC curve; CRISP-DM; cross-industry standard process for data mining. File-URL: http://www.inderscience.com/link.php?id=148564 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. Handle: RePEc:ids:injdan:v:17:y:2025:i:3:p:196-215 Template-Type: ReDIF-Article 1.0 Author-Name: Korakot Wichitsa-nguan Jetwanna Author-X-Name-First: Korakot Wichitsa-nguan Author-X-Name-Last: Jetwanna Author-Name: Orathai Yongseng Author-X-Name-First: Orathai Author-X-Name-Last: Yongseng Author-Name: Supanan Kongmee Author-X-Name-First: Supanan Author-X-Name-Last: Kongmee Author-Name: Tanongsak Sukyareak Author-X-Name-First: Tanongsak Author-X-Name-Last: Sukyareak Author-Name: Wasun Bunyod Author-X-Name-First: Wasun Author-X-Name-Last: Bunyod Author-Name: Chidchanok Choksuchat Author-X-Name-First: Chidchanok Author-X-Name-Last: Choksuchat Author-Name: Nuntouchaporn Prateepausanont Author-X-Name-First: Nuntouchaporn Author-X-Name-Last: Prateepausanont Author-Name: Thanathip Limna Author-X-Name-First: Thanathip Author-X-Name-Last: Limna Title: Improving public health outcomes through accurate UV index forecasting: ARIMA and ANN approach in Songkhla Province Abstract: This research forecasts the UV Index using five weather parameters: temperature, dew point, humidity, wind speed, and atmospheric pressure in Muang District, Songkhla Province, over a period of 1000 days (from March 6, 2021, to November 30, 2023). It employs a combined autoregressive integrated moving average (ARIMA) and artificial neural network (ANN) model for prediction. The ARIMA model outputs were further used to forecast the UV index with ANN, yielding high accuracy. The dataset was processed to handle missing data using median values. 
Results showed that the ARIMA model had a MAPE of 0.04% to 26.49%, an MAE of 0.3% to 4.3%, and an RMSE of 0.4% to 5.4%. Meanwhile, the ANN model demonstrated an accuracy of 94.2%. Journal: Int. J. of Data Analysis Techniques and Strategies Pages: 254-277 Issue: 3 Volume: 17 Year: 2025 Keywords: UV index prediction; ARIMA; autoregressive integrated moving average; ANN; artificial neural networks; weather parameters; public health outcomes. File-URL: http://www.inderscience.com/link.php?id=148565 File-Format: text/html File-Restriction: Access to full text is restricted to subscribers. Handle: RePEc:ids:injdan:v:17:y:2025:i:3:p:254-277