Template-Type: ReDIF-Article 1.0
Author-Name: Shambhu Kumar
Author-X-Name-First: Shambhu
Author-X-Name-Last: Kumar
Author-Name: Arti Jain
Author-X-Name-First: Arti
Author-X-Name-Last: Jain
Author-Name: Dinesh C.S. Bisht
Author-X-Name-First: Dinesh C.S.
Author-X-Name-Last: Bisht
Title: Enhancing link prediction in dynamic social networks: a novel algorithm integrating global and local topological structures
Abstract:
The link prediction problem has gained significant importance due to the emergence of many social networks. Existing link prediction algorithms in social networks often prioritise local or global attributes, yielding satisfactory performance on specific network types but with limitations like reduced accuracy or higher computational burden. This paper presents a novel link prediction approach that integrates global and local topological structures, assessing the similarity of a node pair through a similarity index based on three key features: the number of common neighbours between the nodes, with a penalty factor introduced for each common node; node influence; and the shortest path distance between unconnected nodes. Evaluation using AUC on seven datasets demonstrates significant improvement over baseline and state-of-the-art methods, enhancing accuracy by 30% and 6.75%, respectively. This highlights the efficacy of integrating global and local features for more accurate link prediction.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 26-53
Issue: 1
Volume: 17
Year: 2025
Keywords: social network; link prediction; common neighbour; similarity measure; degree centrality; node distance.
File-URL: http://www.inderscience.com/link.php?id=144611
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:1:p:26-53

Template-Type: ReDIF-Article 1.0
Author-Name: Serkan Alkan
Author-X-Name-First: Serkan
Author-X-Name-Last: Alkan
Title: Comparative analysis of distance measures in stock network construction and cluster analysis
Abstract:
The mutual information (MI) metric and the Pearson correlation metric are both widely used in cluster analysis and stock network construction. This paper presents a detailed comparison between the MI metric and the Pearson correlation metric. To detect nonlinear relationships, polynomial and natural cubic spline regressions are proposed as alternatives to the MI metric. The methodology for computing model-fitting indices for determining network adjacencies is explained in detail, along with a comparison of the results with the MI methodology. This study employs two data sets derived from the log returns of the daily adjusted closing prices of 402 stocks in the S&P 500 index to measure the impact of a financial crisis on nonlinearity: one covering the crisis period from January 2007 to December 2009, and the other covering the non-crisis period between January 2012 and December 2015. The local and global properties of hierarchical stock networks are compared using the minimum spanning tree for each distance measure. The graph-theoretic internal cluster validity indices and external indices are also used to investigate the relationship between the performance of the community detection algorithm and the selection of metrics.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 75-102
Issue: 1
Volume: 17
Year: 2025
Keywords: financial networks; mutual information; Pearson correlation; regression models; community detection.
File-URL: http://www.inderscience.com/link.php?id=144614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:1:p:75-102

Template-Type: ReDIF-Article 1.0
Author-Name: Ambily Balaram
Author-X-Name-First: Ambily
Author-X-Name-Last: Balaram
Author-Name: Nedunchezhian Raju
Author-X-Name-First: Nedunchezhian
Author-X-Name-Last: Raju
Title: A frequent itemset generation approach in data mining using transaction-labelling dynamic itemset counting method
Abstract:
A significant amount of data is generated, gathered, stored, and evaluated in real-world applications as a result of technology breakthroughs. Data mining (DM) combines a number of disciplines to efficiently discover hidden patterns from vast archives of historical information. To significantly reduce complexities associated with data, the proposed method, transaction-labelling dynamic itemset counting (TL-DIC), utilises a labelling approach on the given transactional database to logically arrange and process the underlying transactions. This method generates frequent itemsets, thereby improving the performance of the conventional dynamic itemset counting (DIC) method. Based on experimental findings, the average scan counts of DIC and M-Apriori are 4% and 3.66% higher, respectively, than that of TL-DIC for different support counts. In terms of execution time, TL-DIC runs 20% and 16% faster than DIC and M-Apriori, respectively. These results validate the proposed approach's efficacy in creating frequent itemsets from large datasets.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 54-74
Issue: 1
Volume: 17
Year: 2025
Keywords: data mining; association rule mining; ARM; dynamic itemset counting method; DIC; frequent itemset generation; transaction labelling; TL; labelling; complexities; scan count; transactional database; minimum support threshold; transaction-labelling dynamic itemset counting; TL-DIC.
File-URL: http://www.inderscience.com/link.php?id=144615
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:1:p:54-74

Template-Type: ReDIF-Article 1.0
Author-Name: Wallace Anacleto Pinheiro
Author-X-Name-First: Wallace Anacleto
Author-X-Name-Last: Pinheiro
Author-Name: Ricardo Q.A. Fernandes
Author-X-Name-First: Ricardo Q.A.
Author-X-Name-Last: Fernandes
Author-Name: Ana Bárbara Sapienza Pinheiro
Author-X-Name-First: Ana Bárbara Sapienza
Author-X-Name-Last: Pinheiro
Title: Sorting paired points: a dissimilarity measure based on sorting of series
Abstract:
We propose a new dissimilarity measure that sorts different time series and measures their absolute and relative degree of disorganisation. This work compares this strategy with state-of-the-art dissimilarity and similarity measures, such as dynamic time warping (DTW), maximal information coefficient (MIC) and complexity-invariant distance (CID). Two clustering algorithms, one non-deterministic (K-means) and one deterministic (hierarchical), allow us to analyse their results. To infer the accuracy, we use two different indexes: maximal HITS and the adjusted Rand index. The results of the experiments, over 128 different datasets, demonstrate that the proposed approach provides more accurate results for different domains using the proposed metrics.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 1-25
Issue: 1
Volume: 17
Year: 2025
Keywords: clustering; similarity; time series; entropy; sorting.
File-URL: http://www.inderscience.com/link.php?id=144620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:1:p:1-25

Template-Type: ReDIF-Article 1.0
Author-Name: Nongyao Nai-arun
Author-X-Name-First: Nongyao
Author-X-Name-Last: Nai-arun
Author-Name: Warachanan Choothong
Author-X-Name-First: Warachanan
Author-X-Name-Last: Choothong
Title: Ensemble learning models for predicting the gaming addiction behaviours of adolescents
Abstract:
This paper aims: 1) to create a prediction model for the game addiction of adolescents using six data mining algorithms; 2) to optimise the models by adjusting the parameters; 3) to create an ensemble model. Bagging and boosting algorithms were investigated for improving the models. Data were collected from eight Northern Rajabhat Universities in Thailand. The results showed that bagging with neural network achieved the highest performance, with an accuracy of 99.35%, followed by boosting with neural network (99.02%) and the model with the best-optimised parameters of the neural network algorithm, achieved by adjusting the learning rate. The best model was used to develop a web application for predicting the gaming addiction behaviours of adolescents, which would contribute to solving the problem.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 103-125
Issue: 1
Volume: 17
Year: 2025
Keywords: classification; ensemble learning; bagging; boosting; neural network; random forest; optimisation; gaming addiction behaviours.
File-URL: http://www.inderscience.com/link.php?id=144623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:1:p:103-125

Template-Type: ReDIF-Article 1.0
Author-Name: Hari Lal Bhaskar
Author-X-Name-First: Hari Lal
Author-X-Name-Last: Bhaskar
Title: Analysis and evaluation of business process management tools and techniques in the Industry 4.0
Abstract:
The purpose of this paper is to analyse and evaluate the different tools and techniques of business process management (BPM) as well as selection and adoption factors for process mining tools in Industry 4.0 for BPM. This paper also discusses how tools and techniques of process mining can be used to drive the pedals of microeconomic principles. It covers the core concepts of BPM and process mining tools in Industry 4.0 as well as the evaluation of different types of models. A tactical roadmap with extensive comparative analysis is provided for selecting a process mining tool or software for initiating a business process optimisation or BPR program. The value of this work lies in showing how the modern-day digitally enabled organisation, Industry 4.0 to be specific, can benefit and re-organise its legacy systems using data-driven business insights in order to achieve operational excellence.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 165-199
Issue: 2
Volume: 17
Year: 2025
Keywords: business process management; BPM; digital transformation; digitalisation; process mining; Industry 4.0; BPM tools; industrial internet of things; IIoT.
File-URL: http://www.inderscience.com/link.php?id=146584
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:2:p:165-199

Template-Type: ReDIF-Article 1.0
Author-Name: Mrunal Prakash Gavali
Author-X-Name-First: Mrunal Prakash
Author-X-Name-Last: Gavali
Author-Name: Abhishek Verma
Author-X-Name-First: Abhishek
Author-X-Name-Last: Verma
Title: Ensemble of large self-supervised transformers for improving speech emotion recognition
Abstract:
Speech emotion recognition (SER) is a challenging and active field of collaborative, social robotics to improve human-robot interaction (HRI) and affective computing as a feedback mechanism. More recently self-supervised learning (SSL) approaches have become an important method for learning speech representations. We present results of experiments on the challenging large-scale speech emotion RAVDESS dataset. Six very large state-of-the-art self-supervised learning transformer models were trained on the speech emotion dataset. Wav2Vec2.0-XLSR-53 was the most successful of the six level-0 models and achieved classification accuracy of 93%. We propose majority voting ensemble models that combined three and five level-0 models. The five-model and three-model majority voting ensemble models achieved 96.88% and 96.53% accuracy respectively and thereby significantly outperformed the best level-0 model and surpassed the state-of-the-art.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 217-244
Issue: 2
Volume: 17
Year: 2025
Keywords: speech emotion recognition; SER; self-supervised learning; SSL; emotion AI; transformers; speech processing; acoustic features.
File-URL: http://www.inderscience.com/link.php?id=146585
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:2:p:217-244

Template-Type: ReDIF-Article 1.0
Author-Name: Sowjanya Yerramaneni
Author-X-Name-First: Sowjanya
Author-X-Name-Last: Yerramaneni
Author-Name: Sudheer K. Reddy
Author-X-Name-First: Sudheer K.
Author-X-Name-Last: Reddy
Title: A review on breast cancer detection using machine learning techniques
Abstract:
Breast cancer is one of the major diseases with a high mortality rate in women. As the death rate of women has been increasing every year, it is necessary to decrease this number by detecting cancerous cells accurately using various methods. This paper presents a review of various works on the detection of breast cancer using machine learning techniques such as decision tree, random forest, K-nearest neighbour, support vector machine, logistic regression and Naïve Bayes classifier. In addition, the paper covers various deep neural network techniques and a comparison of various works. The reviewed approaches follow several steps, namely pre-processing of the breast image, mass detection, feature selection and image segmentation, feature extraction and classification. These steps are applied to various datasets, namely the Wisconsin dataset, ImageNet, BreakHis, histopathological images and MIAS. The performance of various models has been examined and compared using accuracy, sensitivity and specificity metrics. The authors present an overview of the current developments in cancer research leveraging machine learning, deep learning and transformer models, and propose the future scope of the work.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 142-164
Issue: 2
Volume: 17
Year: 2025
Keywords: breast cancer; classification models; machine learning; neural networks; deep learning.
File-URL: http://www.inderscience.com/link.php?id=146586
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:2:p:142-164

Template-Type: ReDIF-Article 1.0
Author-Name: Sena Kumcu
Author-X-Name-First: Sena
Author-X-Name-Last: Kumcu
Author-Name: Bahar Özyörük
Author-X-Name-First: Bahar
Author-X-Name-Last: Özyörük
Title: An approach to improve the healthcare purchase decision: an application in a healthcare centre in Türkiye
Abstract:
The right supplier selection and order quantity allocation decisions are crucial for the healthcare sector because it must deliver its products and services to its patients properly and on time. However, in this sector, supplier selection and order allocation decisions are still not given enough attention. For this reason, there is a significant research and application gap in the literature. In this study, always-better-control vital-essential-desirable (ABC-VED) analysis was first used to determine the annual purchasing needs for the medical equipment that is vital for a healthcare centre in Ankara, Türkiye. Then, six different scenarios for the identified vital equipment were created using a goal programming model in GAMS (24.1.3) to help the decision maker improve the purchase decision process. The proposed approach increases the efficiency of the decision process by providing the decision maker with alternative decision plans.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 127-141
Issue: 2
Volume: 17
Year: 2025
Keywords: healthcare procurement practices; supplier selection; order allocation; goal programming; ABC-VED analysis.
File-URL: http://www.inderscience.com/link.php?id=146587
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:2:p:127-141

Template-Type: ReDIF-Article 1.0
Author-Name: Blanka Bártová
Author-X-Name-First: Blanka
Author-X-Name-Last: Bártová
Author-Name: Vladislav Bína
Author-X-Name-First: Vladislav
Author-X-Name-Last: Bína
Title: Training an artificial neural network for an effective PCB defect detection
Abstract:
Printed circuit boards (PCBs) are crucial components of most electronic devices. In recent decades, the PCB manufacturing process has been significantly improved, mainly through the implementation of surface mounted technology (SMT) and automatic optical inspection (AOI). The data used for our analysis are real output from an AOI device, collected in a manufacturing company. The currently used AOI solution achieves an accuracy of 95.82%. The goal of our study was to train an artificial neural network (ANN) to detect defective PCBs with the highest possible accuracy. Different approaches were used for ANN training, such as the experimental approach, regression, and the Taguchi method. The resulting PCA-ANN model combines the principal components analysis (PCA) method for data dimensionality reduction with an ANN for detection of low-quality products. Our proposed model increases the AOI accuracy rate by 3.95%.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 200-216
Issue: 2
Volume: 17
Year: 2025
Keywords: artificial neural network; ANN; Taguchi; printed circuit board; PCB; defect; detection; surface mounted technology; SMT; regression; data mining; networks training; quality management; Industry 4.0.
File-URL: http://www.inderscience.com/link.php?id=146588
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:2:p:200-216

Template-Type: ReDIF-Article 1.0
Author-Name: Xin Zhang
Author-X-Name-First: Xin
Author-X-Name-Last: Zhang
Author-Name: Zhixin Kang
Author-X-Name-First: Zhixin
Author-X-Name-Last: Kang
Author-Name: Guanlin Gao
Author-X-Name-First: Guanlin
Author-X-Name-Last: Gao
Author-Name: Xinyan Shi
Author-X-Name-First: Xinyan
Author-X-Name-Last: Shi
Title: Analysing and forecasting COVID-19 vaccination - evidence from a Native American community in North Carolina, USA
Abstract:
This study examines the determining factors of vaccination decisions for adults and children in a historical tribal region and evaluates the predictive power of various machine learning models. COVID-19 vaccination data were investigated, though the proposed method may be used for evaluating other vaccination data. We administered a survey and collected cross-sectional data (e.g., socio-demographics, COVID-19 testing behaviours, vaccination status, and people's knowledge about, attitude toward, and belief in the vaccines), developed new features and built predictive models (e.g., random forest, neural network, and decision tree), and evaluated their performance against the benchmark logistic regression models. The results show that people who tested more frequently, believed vaccination is a social responsibility, and were provided with paid leave by employers are more likely to be fully vaccinated and to vaccinate their children. Our results also show that not all machine learning models outperform the logistic regression model.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 245-271
Issue: 3
Volume: 17
Year: 2025
Keywords: COVID-19 vaccination intention; feature design and evaluation; vaccination forecasting; machine learning; Bayesian-correlation; model evaluation.
File-URL: http://www.inderscience.com/link.php?id=148835
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:3:p:245-271

Template-Type: ReDIF-Article 1.0
Author-Name: Jyotirmayee Rautaray
Author-X-Name-First: Jyotirmayee
Author-X-Name-Last: Rautaray
Author-Name: Sangram Panigrahi
Author-X-Name-First: Sangram
Author-X-Name-Last: Panigrahi
Author-Name: Ajit Kumar Nayak
Author-X-Name-First: Ajit Kumar
Author-X-Name-Last: Nayak
Title: Multi-document text summarisation using DL-BILSTM model with hybrid algorithms
Abstract:
With the overwhelming amount of information available online, it becomes challenging for users to access relevant data. Automated techniques are essential to effectively filter and extract valuable information from vast datasets. Recently, text summarisation has emerged as a key method for distilling relevant content from lengthy documents. This work introduces a novel deep learning-based approach for multi-document text summarisation. The proposed system begins with preprocessing tasks such as stop word removal, sentence and paragraph chunking, stemming, and lemmatisation. Textual phrases are transformed into vector space models using TF-ISF and sentence scores are evaluated. A deep learning-based bidirectional long short-term memory model is employed for summarisation. Additionally, cat swarm optimisation and Aquila optimisers refine the DL model's parameters. The approach is validated using the DUC 2002, DUC 2003, and DUC 2005 datasets, demonstrating superior performance across various metrics including ROUGE scores, BLEU scores, cohesion, sensitivity, positive predictive value, and readability when compared to other summarisation methods.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 334-363
Issue: 3
Volume: 17
Year: 2025
Keywords: multi-document text summarisation; MDTS; BiLSTM; term frequency-inverse sentence frequency; deep learning; Aquila optimiser; cat swarm optimisation; CSO; natural language processing; NLP.
File-URL: http://www.inderscience.com/link.php?id=148836
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:3:p:334-363

Template-Type: ReDIF-Article 1.0
Author-Name: Bibi Saqia
Author-X-Name-First: Bibi
Author-X-Name-Last: Saqia
Author-Name: Khairullah Khan
Author-X-Name-First: Khairullah
Author-X-Name-Last: Khan
Author-Name: Atta Ur Rahman
Author-X-Name-First: Atta Ur
Author-X-Name-Last: Rahman
Title: Identifying immoral posts on social media platforms: a review
Abstract:
Social media has become an integral part of our lives, connecting people across different parts of the world. Recently, there has been increasing concern over the proliferation of immoral content on social media platforms. The ease and speed of communication on social media have made it a popular platform for people to express their opinions, but it has also led to the spread of harmful and immoral content. Hate speech, cyberbullying, and other forms of immoral behaviour are common on social media platforms and can have serious consequences for the individuals involved and the wider community. Current literature reviews have normally focused on a specific class of immoral posts, such as hate speech. According to this study, no review has been dedicated to the overall categories of immoral post identification. This paper describes a systematic literature review of computational approaches, resources, challenges, and research gaps concerning the overall categories of immoral post identification.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 296-333
Issue: 3
Volume: 17
Year: 2025
Keywords: immoral posts; social media; cyberbullying; hate speech; challenges and issues.
File-URL: http://www.inderscience.com/link.php?id=148837
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:3:p:296-333

Template-Type: ReDIF-Article 1.0
Author-Name: Nur Izzaty
Author-X-Name-First: Nur
Author-X-Name-Last: Izzaty
Author-Name: Adelia Shinta
Author-X-Name-First: Adelia
Author-X-Name-Last: Shinta
Author-Name: Riski Arifin
Author-X-Name-First: Riski
Author-X-Name-Last: Arifin
Author-Name: Sri Rahmawati
Author-X-Name-First: Sri
Author-X-Name-Last: Rahmawati
Title: Sentiment analysis on customer reviews in Indonesian marketplace using natural language processing (a case study of organic face mask)
Abstract:
The rapid development of technology has transformed customer purchasing behaviour from offline to online through marketplaces. One of the most popular marketplaces in Indonesia is Shopee, where the best-selling skincare product is the organic face mask. This study aims to analyse the sentiment of customer reviews using natural language processing (NLP) and term frequency-inverse document frequency (TF-IDF). The results revealed that, of the 882 reviews extracted, 89.7% were classified as positive (rating 4 and 5) and the remaining 10.3% as negative (rating 1 and 2). The sentiments were visualised using a word cloud. Frequent terms in the positive reviews included 'very good', 'quickly absorbed', and 'convenient', while those in the negative reviews included 'disappointed', 'delivery', and 'acne'. The performance metrics used to evaluate the classification model showed that the model accuracy reached 95%.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 364-381
Issue: 3
Volume: 17
Year: 2025
Keywords: customer reviews; natural language processing; NLP; sentiment analysis; term frequency-inverse document frequency; TF-IDF; skincare; organic face mask.
File-URL: http://www.inderscience.com/link.php?id=148839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:3:p:364-381

Template-Type: ReDIF-Article 1.0
Author-Name: Shini Lawrance
Author-X-Name-First: Shini
Author-X-Name-Last: Lawrance
Author-Name: J.R. Jeba
Author-X-Name-First: J.R.
Author-X-Name-Last: Jeba
Title: Ensemble model with improved DCNN for big data classification by handling class imbalance problem
Abstract:
This research proposes a five-phase big data classification model that uses an improved deep convolutional neural network (IDCNN). In the first phase, Z-score normalisation is employed to preprocess the input data. In the second phase, the preprocessed data are processed with SMOTE-ENC to address class imbalance. The third phase extracts the feature set, which includes raw data and features based on correlation, entropy, and MI. In the fourth phase, an improved recursive feature elimination (IRFE) approach is applied to the extracted features to ensure appropriate feature selection. Finally, ensemble classification is performed on the selected features using a collection of classifiers: Bi-LSTM, SVM, RNN and IDCNN. The IDCNN classifier produces the final result by taking the Bi-LSTM, SVM and RNN output scores as input.
Journal: Int. J. of Data Mining, Modelling and Management
Pages: 272-295
Issue: 3
Volume: 17
Year: 2025
Keywords: data; classification; class imbalance; deep convolutional neural network; DCNN; improved recursive feature elimination; IRFE.
File-URL: http://www.inderscience.com/link.php?id=148853
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:ids:ijdmmm:v:17:y:2025:i:3:p:272-295