Forthcoming Articles
International Journal of Data Mining and Bioinformatics

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.
Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.
Online First articles are also listed here. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.
International Journal of Data Mining and Bioinformatics (9 papers in press)

Special Issue on: OA Big Data Industrial Application and Computing Innovation
Abstract: The rapid expansion of online education has made the analysis of users' implicit behaviours, viewed through the lens of nonlinear and complex data, a crucial avenue for enhancing educational effectiveness. To address this, we introduce a random forest-fuzzy comprehensive evaluation (RF-FCE) method embedded within a clustering framework. Leveraging multiple clustering techniques, we first identify distinct category-specific influence patterns across different courses. Subsequently, we integrate fuzzy comprehensive evaluation with machine learning to analyse implicit behavioural data, examining both the intrinsic factors that affect course outcomes and the complex interactions between these factors and course quality. Our findings reveal significant variations in user engagement and learning outcomes across courses of differing quality, with these variations exerting a substantial influence on learning behaviours. In summary, this study offers a structured and robust analytical approach for examining implicit user behaviours in online education, demonstrating both methodological innovation and practical utility for improving course design and delivery.
Keywords: online course; data mining technology; implicit behaviour; cluster analysis; online education; course quality evaluation; behaviour analysis.
DOI: 10.1504/IJDMB.2026.10077083

Regular Issues
by Qianqian Sun, Wei Qi, Fei Tan, Hongwei Zhang
Abstract: This study aimed to identify key biomarkers for venous thromboembolism (VTE) following blunt trauma. Using bioinformatics analysis of public gene expression datasets (GSE19151 and GSE36809), we screened for differentially expressed genes (DEGs) and constructed co-expression networks. Functional enrichment analysis revealed critical biological pathways, and a protein-protein interaction network was established to pinpoint central hub genes. Six hub genes (MRPL15, MRPL3, MYC, RPLP0, TP53, and CD3D) were identified, with MRPL3 and MYC showing promising diagnostic potential (AUC of 0.71 and 0.76, respectively). These findings suggest that these genes may serve as novel biomarkers for the early diagnosis of trauma-associated VTE, offering a foundation for future clinical validation and targeted therapeutic strategies.
Keywords: venous thromboembolism; VTE; biomarkers; diagnosis.
DOI: 10.1504/IJDMB.2026.10076808

Special Issue on: The Development of Novel Integrative Bioinformatics Based Machine Learning Techniques and Multi Omics Data Integration Part 2
by Dhiraj Kumar Singh, Prashant Ranjan, Sahar Qazi, Bimal Prasad Jit, Amit Kumar Verma, Riyaz Ahmad Mir
Abstract: Genetic alterations in normal brain cells lead to the development of brain tumours (BT). The incidence of newly diagnosed cases is rising over time. Understanding the molecular biology of paediatric brain tumours is crucial for advancing novel therapeutic approaches to prevent or effectively manage this disease. The R2TP complex, a co-chaperone conserved from yeast to mammals and comprising RUVBL1, RUVBL2, PIH1D1, and RPAP3 in humans, plays a crucial role in the assembly and maturation of various multi-subunit complexes. This study evaluates the expression of PIH1D1 and p53 in paediatric brain cancers using The Cancer Genome Atlas (TCGA) data through UALCAN. Our analysis revealed elevated expression levels of PIH1D1 in paediatric brain tumours across all age groups compared to normal tissues, suggesting its potential as an early detection marker and a prognostic indicator. Additionally, p53 emerged as a promising target for brain tumour treatment, warranting exploration for age-specific applications.
Keywords: R2TP; PIH1D1; paediatric brain tumour; TCGA; UALCAN; CBTTC.
DOI: 10.1504/IJDMB.2025.10067136

ICEP and ILEP: two new approaches to identify community of complex biological network
by Mamata Das, K. Selvakumar, P.J.A. Alphonse
Abstract: Understanding the internal modular organisation of protein-protein interactions is crucial for deciphering molecular-level biological processes. Recognition of network communities enhances our comprehension of the biological origins of disease pathogenesis. This research introduces two innovative community detection algorithms, iterative credit-edge pruning (ICEP) and iterative load-based edge pruning (ILEP), designed to identify communities within complex biological networks.
Our algorithms are evaluated using real-world data from the Omicron dataset, and their performance is compared with four established algorithms: Girvan-Newman, Louvain, Leiden, and the label propagation algorithm. The community structures are validated using modularity. Among the techniques compared, our proposed method, ICEP, achieves the highest modularity score, 0.885. The alternative method, ILEP, also achieves a notable modularity score of 0.698, surpassing the Girvan-Newman method. By implementing ICEP and ILEP, we gain insight into the structural organisation and interconnections within the Omicron virus.
Keywords: protein interaction network; omicron; community detection; modularity; graphlet; centrality.
DOI: 10.1504/IJDMB.2025.10067341

BMSD-CDE: a robust community detection ensemble method for biomarker identification
by Bikash Baruah, Manash P. Dutta, Subhasish Banerjee, Dhruba K. Bhattacharyya
Abstract: Community detection algorithms (CDAs) are crucial for identifying cohesive groups within complex networks. However, individual CDAs often fall short of accurately uncovering all hidden communities due to their inherent biases and limitations. These algorithms are typically designed with specific objectives, which may inadvertently lead to the oversight of certain community types, resulting in partial or imprecise outcomes. To address these limitations, we propose BMSD-community detection ensemble (CDE), a novel ensemble method that integrates six prominent CDAs: FastGreedy, Infomap, LabelProp, LeadingEigen, Louvain, and Walktrap. By strategically combining the outputs of these diverse algorithms using p-value references and elite genes, BMSD-CDE enhances the accuracy and robustness of community detection. This ensemble approach provides a more reliable foundation for downstream analyses, particularly in identifying potential biomarkers.
Applied to esophageal squamous cell carcinoma (ESCC), BMSD-CDE reveals a set of genes (F2RL3, ATP6V1C2, CGN, CAD, ANGPT2, ALDH2, CLDN7, and DTX2) as potential biomarkers. These findings are supported by extensive topological and biological analyses across normal and disease conditions using four distinct datasets.
Keywords: potential biomarker; community detection algorithm; CDA; ensemble algorithm; topological experiment; ESCC; biological validation; community detection ensemble; CDE.
DOI: 10.1504/IJDMB.2025.10067623

Multi-epitopes prediction for designing a candidate vaccine against Ebola virus: a reverse vaccinology and immunoinformatics approach
by Swati Mohanty, Himanshu Singh
Abstract: Over a span of four decades, outbreaks of Ebola virus disease (EVD) have wreaked havoc, starting in Central African countries and reaching different parts of the world, including Asian countries. Guinea was the first to witness the catastrophe, followed by many African and Asian countries including Liberia and Sierra Leone. In this study, an immunoinformatics approach covering both B cell and T cell epitopes was used for candidate vaccine development against EVD. B cell and T cell epitopes were predicted by targeting the glycoprotein (GP) and VP40 proteins of Ebolavirus, and an antigenic multi-epitope vaccine construct was designed. The vaccine construct was then docked with human Toll-like receptor 4 (TLR4), with a binding energy of -13,883.1, and in silico immune simulation was performed to predict its immunogenic potential. With a CAI of 0.94 and a GC content of 54.35, the construct showed efficient expression in the Escherichia coli (E. coli) K12 strain, which would allow the vaccine to be produced at scale. The Ebola virus vaccine construct designed through the immunoinformatics approach in this study could be useful in combatting EVD.
Keywords: Ebola virus; epitope-based vaccine; molecular docking; immunoinformatics; reverse vaccinology.
DOI: 10.1504/IJDMB.2025.10068508

Downregulation of CENPA and CCNB1 as a factor predicting the poor prognosis of acute myeloid leukaemia: a systems biological approach
by Mohammad Hossein Shams, Saeid Afshar, Elmira Parto Beiragh, Azin Atabakhsh, Hassan Rafieemehr
Abstract: Acute myeloid leukaemia (AML) is a complex hematologic malignancy. The present study takes a novel approach using bioinformatics to identify the primary molecular markers involved in AML pathogenesis. The differential expression of GEO microarray data (logFC ≤ -1 or ≥ 1, adj. P-value ≤ 0.01, P-value ≤ 0.01) is analysed, and the corresponding protein-protein interaction (PPI) network is then drawn and examined using Cytoscape 3.6. The findings are validated externally and clinically using the GEPIA database and a survival curve. This study also identified important transcription factors (TF) affecting the expression of hub genes. The key finding is that the downregulation of CENPA and CCNB1 is associated with shorter overall survival in AML, with FOXM1 identified as a potential regulating TF. It also suggests that disruption of various cellular processes, such as the cell cycle, replication, and cell signalling, may play a role in the pathogenesis of AML.
Keywords: CENPA; CCNB1; systems biology; FOXM1; molecular markers; gene expression profiling.
DOI: 10.1504/IJDMB.2025.10069104

Skin image analysis for detecting monkeypox disease: utilising new model M-Net, a non-invasive deep learning model
by Vinod Kumar Yadav, Rajitha Bakthula
Abstract: Skin and skin-related diseases pose a significant public health challenge worldwide, leading to major concerns in medical diagnosis. Various environmental factors, including bacteria, fungi, and viruses, can contribute to these conditions, resulting in a growing number of individuals affected by skin diseases. Most physicians rely on manual biopsy tests for skin disease diagnosis, which can delay treatment.
Therefore, there is a high demand for automated skin disease classification systems to provide quick and accurate results. Deep learning (DL) has recently shown remarkable effectiveness in image-based classification tasks, such as identifying skin cancer, rosacea, melanocytic nevus, tumour cells, and COVID-19 patients. Consequently, DL can also be adapted to detect monkeypox skin disease. In this article, we propose a novel approach consisting of two phases. First, new HR, UOR, and BR algorithms are used to preprocess the images. Second, a custom CNN model is developed for monkeypox classification. The proposed model is compared with existing approaches in the literature and demonstrates superior performance, achieving an accuracy of 95%.
Keywords: image pre-processing; classification; hair removal; object removal; background removal; data augmentation.
DOI: 10.1504/IJDMB.2025.10071008

Machine learning approaches for disease genes prediction
by Priya Sadana, Isha Kansal, Vikas Khullar
Abstract: The identification of genes involved in human hereditary diseases frequently necessitates the examination of a large number of potential candidate genes, which can be time-consuming and expensive. Genome-wide techniques such as association studies and linkage analysis frequently select many hundreds of positional candidates. Earlier binary classification methods used disease-causing and healthy genes as positive and negative training sets but risked including unknown disease genes. This work aims to discuss machine learning-based methods for disease susceptibility gene identification. Recent advancements include complex methods like ensemble and deep learning. Then, we evaluated several well-known machine learning-based disease gene prediction algorithms. We concluded by discussing the pros and cons of different methods and their interpretability and reliability.
A comparative study demonstrates the effectiveness of proposed approaches, contributing to the advancement of disease gene identification methodologies while highlighting their interpretability and reliability.
Keywords: neurological disorder; gene prediction; binary classification; semi-supervised learning; SSL.
DOI: 10.1504/IJDMB.2025.10069769