Forthcoming and Online First Articles

International Journal of Bioinformatics Research and Applications (IJBRA)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Open Access: Articles marked with the Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

Register for our alerting service, which notifies you by email when new issues are published online.

International Journal of Bioinformatics Research and Applications (16 papers in press)

Regular Issues

  • A Review of Support Vector Machine in Cancer Prediction on Genomic Data
    by Revathi L, Ramaswami M. M 
    Abstract: Cancer is the most prevalent disease that leads to death globally. According to the World Health Organization (WHO) report, cancer claims over 10 million lives yearly. Extensive research has focused on early detection and prevention through clinical and laboratory studies. Genomic technologies enable the analysis of large cancer-related datasets, while machine learning algorithms aid in early detection. This paper explores earlier studies on supervised machine learning techniques and feature selection methods on high-dimensional gene expression data. Furthermore, this study emphasises the significance of the support vector machine (SVM) in cancer prediction and diagnosis, highlighting its superior performance compared to other classification methods; in particular, the choice of kernel function strongly influences the performance of SVM. Additionally, feature selection extracts informative genes from microarray data, which leads to high predictive accuracy and lower computational complexity. The paper concludes that both machine learning approaches and SVM make substantial contributions to cancer prediction.
    Keywords: cancer prediction; feature selection; gene expression; kernel function; machine learning; ML; supervised; support vector machine; SVM.
    DOI: 10.1504/IJBRA.2024.10060711
  • DNA Barcoding and Annotation Study on Anisomeles malabarica (L.) R.Br. in BOLD system v.4
    by Santhanalakshmi Balasubramaniam, Sivanandhan Ganeshan, Selvaraj Natesan, Kapildev Gnanajothi 
    Abstract: Plant DNA barcoding is primarily used for species identification, with applications in taxonomy, conservation, ecology, forensics, and certification of herbal products. This technique follows the process of DNA extraction, amplification, and sequencing without the requirement of taxonomic expertise, making it easier, faster, and more reliable than conventional approaches. The Barcode of Life Data (BOLD) system was developed by the Center for Biodiversity Genomics to store and analyse the data generated from massive barcodes. This system holds 13 million barcodes in organised projects, and represents 72,000 plant species. This article discusses the features of the BOLD system v.4, with a brief review of plant DNA barcoding. Anisomeles malabarica, of the family Lamiaceae, has been selected as the model plant, and its barcode data are documented in the workbench of BOLD. A unique DOI 10.5883/DS-AMBC22 and accession numbers from GenBank were generated for the barcodes in the public portal for reference and retrieval.
    Keywords: DNA barcoding; species identification; BOLD systems; Anisomeles malabarica; GenBank; matK; rbcL; trnH psbA; dataset; projects.
    DOI: 10.1504/IJBRA.2024.10060749
  • Sparse representation based motor imagery EEG classification towards asynchronous BCI systems
    by C. Sivananda Reddy, M. Ramasubba Reddy
    Abstract: Most of the existing motor imagery (MI)-based brain-computer interface (BCI) systems operate synchronously with system-generated time slots. But in real-world applications, users want to control the interface asynchronously at their own convenience. The main challenge in such asynchronous BCIs lies in the detection of the relax period. In this study, a sparse representation-based classification (SRC) scheme is proposed for asynchronous BCI systems. The dictionary needed for the SRC scheme is learned from the extracted EEG features using the K-SVD algorithm. The proposed framework is evaluated on two benchmark datasets from BCI competitions III and IV. The results show that SRC detects relax states and MI states better than the well-known linear discriminant analysis classification method. The proposed scheme also achieves higher accuracy when classifying left-hand MI, right-hand MI, and the relaxed state.
    Keywords: brain computer interface; BCI; electroencephalogram; EEG; motor imagery; MI; sparse representation based classification; SRC; dictionary learning; DL.
    DOI: 10.1504/IJBRA.2024.10060881
  • Biomarker identification from gene expression: An effective computational pipeline
    by Emon Asad, Ayatullah Faruk Mollah 
    Abstract: Discovering biomarkers from microarray data is an extremely important research subject, as biomarkers help to diagnose disease types, find therapeutic plans for a disease, and contain crucial biological information about organisms. In this paper, a machine learning-based two-stage biomarker identification technique for microarray datasets is presented. In the first stage, analysis of variance (ANOVA) F-scores are used to select the top quartile of genes as candidate biomarkers, whereas in the second stage, the performance of the candidate biomarkers is examined with an ensemble classifier and the responsible biomarker(s) are identified based on their ability to characterise the corresponding genetic disease(s). Interestingly, this method yields 100% classification accuracy with only one biomarker for each of the six different types of publicly available microarray datasets considered in this work, which is undoubtedly superior to many state-of-the-art methods. The selected biomarkers are also found to be biologically relevant and meaningful in terms of gene ontology, DisGeNET and various biochemical pathway terms.
    Keywords: biomarker identification; genetic diseases; microarray gene expression; feature selection; analysis of variance; ANOVA.
    DOI: 10.1504/IJBRA.2024.10061254
  • Diabetic Prediction Framework using Optimization Strategy via Optimal Weighted Score-based Deep Ensemble Network to Support Diabetic Patients
    by Santosh Kumar Bejugam, Jyothi Vankara 
    Abstract: Diabetes is one of the dangerous diseases that increase blood glucose levels and it affects the patient’s life. Next, in the deep feature extraction stage, the collected data is employed as the input. Here, the deep features are extracted using one-dimensional convolutional neural network (1DCNN). Then, the acquired optimal features are offered as the input to intelligent deep ensemble network (IDENet) that holds the networks such as long short-term memory (LSTM), 1DCNN, deep temporal context networks (DTCN) and extreme learning (EL). The parameters of IDENet are tuned by enhanced light spectrum with horse herd optimisation (ELS-HHO). Further, the attained predicted values from the IDENet are fed as the input to the weighted fusion of predicted values. Then, their weights are tuned by ELS-HHO to attain the effective glucose prediction outcome. Finally, the suggested glucose prediction model secured a better prediction rate than the classical glucose prediction models in experimental observation.
    Keywords: diabetics prediction framework; ELS-HHO; optimal weighted predicted scores; intelligent deep ensemble network; IDENet.
    DOI: 10.1504/IJBRA.2024.10061485
  • In-silico Rituximab Protein Engineering to Improve Humanization and Reduce Immunogenicity
    by Harit Kasana, Harish Chander, Ashwani Mathur 
    Abstract: Rituximab is a monoclonal antibody with a high degree of specificity towards the CD20 antigen, found on the surface of B lymphocytes, and is used in the treatment of diverse B cell lymphomas and autoimmune disorders. Rituximab is a chimeric monoclonal antibody with reported adverse effects due to the presence of non-human sequences. Here, we attempted to improve the humanness score of rituximab using a computational approach by using combinatorial mutations at the sequence patches. These changes were imposed in: a) CDR; b) non-CDR; c) CDR + non-CDR mutants at Ala49 Gly, Thr50 Ala, and Leu53 Arg, Ser5 Thr, Ala9 Gly, and Ile10 Thr. These mutations did not affect the structural stability, as interpreted from the MD simulation analysis. However, non-CDR mutants showed marginally higher structural variation compared to CDR mutants and native rituximab. It is suggested in this study that this rational design can improve the humanness characteristics of rituximab without affecting its structural and therapeutic behaviour.
    Keywords: rituximab; protein engineering; immunogenicity; molecular docking; molecular dynamic simulation.
    DOI: 10.1504/IJBRA.2024.10061490
  • A Dense sub-graph based approach for Automatic detection of Optic Disc
    by Subrata Jana, Gour Sundar Mitra Thakur, Tribeni Prasad Banerjee, Pabitra Mitra 
    Abstract: Glaucomas are a group of eye disorders characterized by the degeneration of the optic nerve, predominantly due to elevated intraocular pressure. Such degradation can culminate in irreversible vision loss. A significant challenge in the study and treatment of glaucomas is the accurate localization of the optic disc, a crucial anatomical landmark associated with disease progression. This research aims to present an advanced graph-based method for the automatic identification of the optic disc's exact location, addressing the existing challenges in conventional techniques. Leveraging the K-dense sub-graph approach, our method offers a novel perspective on optic disc localization. It involves the interpretation of intricate patterns in retinal images to pinpoint the affected optic disc area. When evaluated on recognized databases like DRIVE, Drishti-GS1, and STARE, our model exhibited an outstanding accuracy rate of 93% in optic disc localization. This work contributes an innovative and efficient method to the field of ophthalmological research.
    Keywords: Optic Disc; K-dense sub-graph; Graph-based approach; Optic Disc Localization.
    DOI: 10.1504/IJBRA.2024.10061621
  • Utilising Deep Convolutional Neural Networks and Hybrid Clustering Techniques for Predicting Cancer Blood Disorders
    by Pulla Sujarani, M. Yogeshwari 
    Abstract: Blood malignancies and various blood disorders pose significant health challenges across all age groups. This study introduces adaptive fast fuzzy C means hybrid clustering (AFFCMHC) and binary adaptive Otsu (BAO) thresholding for image processing to segment cancer-related blood abnormalities. We recommend deep convolutional neural networks (DCNNs) for predicting cancer-related blood abnormalities. In our framework, blood disorder images are first filtered and enhanced. The 2D hybrid wavelet frequency domain bilateral filter (2D HWFDBF) removes noise from the images. Denoising and 2D EPHI improve image clarity. The enhanced images are then segmented by clustering and thresholding, using AFFCMHC and BAO, respectively. Features are extracted from a real-time collection of microscopic blood sample images from 1,000 cancer patients using the grey level co-occurrence matrix (GLCM). Our DCNN classification architecture trains quickly and achieves 98% accuracy. We compare our system to existing classifiers to test its performance. We developed a complete system for segmenting and predicting cancer-related blood abnormalities, exceeding current methods.
    Keywords: cancer blood disorder; deep convolutional neural networks; DCNN; classification; grey level co-occurrence matrix; GLCM; tumour recognition; medical image analysis.
    DOI: 10.1504/IJBRA.2024.10061714
  • Analysis of Penicillin Binding Sites for Determining Antibiotic Resistance in Pakistan
    by Amna Sethi, Amna Farrukh, Aena Rasheed, Eesha Adnan, Muhammad Abdullah Khan 
    Abstract: Antimicrobial resistance (AMR) has now become a global challenge. The increasing resistance of bacteria combined with the misuse of drugs has led to an era of antibiotics that prove to be of no value to the affected individual. This raises the question of how to counter AMR and what reason lies behind the increasing resistance of these bacteria. This research has shown the causes of resistance due to motif combinations, identified through the literature review. Two methods were employed for curation: manual curation and automated curation through purpose-written code. The results indicate that a decreasing number of motifs is associated with higher bacterial resistance. This study would help in identifying a mutation pattern that could counter the antibiotic effectiveness problem. The obtained results can be further validated through experimental procedures to deduce a concrete hypothesis and to design alternatives to the existing ineffective antibiotics.
    Keywords: antibiotics; antimicrobial resistance; AMR; penicillin binding sites; motifs; Pakistan.
    DOI: 10.1504/IJBRA.2024.10061835
  • Ensemble Feature Selection (EFS) And Deep Learning Ensemble (DLE) Classifier For Cervical Cancer Diagnosis
    by Anjali Kuruvilla, B. Jayanthi 
    Abstract: Cervical cancer has become one of the foremost causes of cancer mortality in women. Developing a new approach requires improving system performance precision. Cervical cancer has many risk factors. Cervical cancer test parameters must be considered when classifying patients based on results. Recently, cervical cancer prediction features have been assessed using several feature selection methodologies. Ensemble feature selection (EFS) outperforms individual techniques. This study classifies cervical cancer cells using a deep learning ensemble (DLE) classifier. EFS combining the results of EBFO, EEHO, and RFE yields better results than using a single FS approach. The DLE classifier uses heterogeneous base learners (GAN, BGRU, and DWCNN) and a meta-learner to predict cervical cancer from risk variables. The DLE classifier builds numerous base classifiers, on top of which a new predictor outperforms any single component. Stacking trains many models on one dataset. The model is generated by segmenting the training set again using K-fold cross-validation. The suggested system uses the DLE classifier and the synthetic minority oversampling technique (SMOTE). The data source has 32 features and four classes: Hinselmann, Schiller, cytology, and biopsy. Precision, recall/sensitivity, F-measure, specificity, and accuracy are calculated using a confusion matrix to determine the superiority of all classification algorithms like random forest (RF) and GAN.
    Keywords: bidirectional gated recurrent unit; BGRU; cervical cancer; entropy butterfly optimisation algorithm; EBFO; ensemble feature selection.
    DOI: 10.1504/IJBRA.2024.10061872
  • A Novel Deep Learning-Based Cardiac Disease Classification
    by M. Chitra Devi, M. Ramaswami, C. Sundar 
    Abstract: The heart is a fist-sized organ that pumps blood throughout the human body. Cardiac disease (CAD) is one of the deadliest diseases, threatening the lives of millions of people worldwide. Nowadays, several machine learning and deep learning techniques are used for the diagnosis of cardiac disease in its early stages. The availability of large amounts of cardiac disease related medical data has aided in the development of automated machine learning and deep learning-based diagnosis systems. To overcome the limitations of traditional approaches, this paper proposes a novel deep learning PCA-1D ConvNet (one-dimensional convolution neural network) architecture for classifying cardiac disease versus non-cardiac disease and predicting cardiac disease. The proposed network achieves over 99.87% training accuracy and 99% test accuracy on the dataset, along with 100% precision, 98% recall and 98.98% F1-score.
    Keywords: cardiac disease; classification; deep learning; 1D ConvNet; overfitting.
    DOI: 10.1504/IJBRA.2024.10061934
  • Uniform Distribution Tuna Swarm Optimization (UDTSO) and Deep Neural Network (DNN) for Fetal Health Classification
    by JANSI B, Sumalatha V 
    Abstract: Foetal health is generally assessed by foetal heart rate (FHR) monitoring throughout the antepartum period. FHR analysis is a difficult and illogical process because of restricted dependability. Previous research only examined cardiotocographic (CTG) dataset classification accuracy, ignoring computational time, a critical clinical issue. Using uniform distribution tuna swarm optimisation (UDTSO), this paper selects the most important CTG features. This study developed a machine-learning algorithm to differentiate normal and abnormal foetal CTG data. The proposed study involves pre-processing, feature selection (FS), classification, and outcome evaluation. First, in pre-processing, the dataset is normalised using min-max normalisation, which rescales each feature to the range 0 to 1. Second, in the feature selection step, the UDTSO algorithm selects a subset of input features, evaluates accuracy and chooses the optimum solution. Third, a deep neural network (DNN) classifies CTG recordings as normal (N), suspect (S), or pathologic (P). The DNN's AlexNet-SVM captures convolution layer filter data. Max pooling minimises weights and concatenates output from a collection of neurons. The fully connected layers are replaced by the AlexNet-SVM classifier to reduce time complexity. Classifiers are assessed on precision, recall, f-measure, and accuracy. The CTG dataset comes from the UCI Machine Learning Repository.
    Keywords: foetal heart rate; FHR; uniform distribution tuna swarm optimisation; UDTSO; deep neural network; DNN; support vector machine; SVM; cardiotocographic; CTG.
    DOI: 10.1504/IJBRA.2024.10062096
  • An effective review on the prediction and analysis of infectious lung diseases using machine learning algorithms
    by V. Indira, D. Annal Priyadarshini, Geetha R, V. Sujatha 
    Abstract: Lung diseases are increasing markedly nowadays due to rapid environmental changes and the variety of viruses in the universe. This review focuses on using machine learning and deep learning algorithms to predict and analyse lung cancer, tuberculosis, coronavirus disease 2019 (COVID-19), influenza, asthma, and chronic obstructive pulmonary disease (COPD). In this paper, machine learning and deep learning models are used to analyse the affected lung mortality and have decreased the amount of physical work needed. This paper inspects how numerous machine-learning algorithms can be used to discover many lung states. The main aim of this review is to envision several tempers in lung disease to analyse using machine learning and diagnose the survival issue and the domain’s feasible future. In addition, it examines how accurately and efficiently machine learning and deep learning categorise lung disease with minimal possible error.
    Keywords: machine learning; ML; deep learning; computer vision; lung disease.
    DOI: 10.1504/IJBRA.2024.10062141
  • A New Machine Learning Approach to Classify MRI of Brain Tumour Using SAE + LSTM
    by Biswaranjan Mishra, Kakita Gopal, Bijay Paikaray, Srikant Patnaik 
    Abstract: A brain tumour is a serious condition that can severely harm brain cells and eventually progress to cancer, which is life-threatening. The patient's chances of survival can be improved when the tumour stages are detected early. The proposed tumour diagnosis uses a fused feature set to increase the classifier's accuracy. To begin with, the features from the MRI images are extracted using the grey level co-occurrence matrix (GLCM) and histogram of oriented gradients (HOG). After dimensionality reduction, features are chosen with a stacked autoencoder (SAE). Second, the high-level features from the MRI images are extracted using the channel-wise attention block. The long short-term memory (LSTM) is trained to produce the results of the classification using the fused features from SAE and the attention block. The proposed approach is evaluated with the BRATS dataset for the years 2018–2020. The accuracy attained over various datasets is 97%, 95.56% and 95.23%.
    Keywords: tumour diagnosis; optimal features; stacked autoencoder; SAE; attention block; long short-term memory; LSTM.
    DOI: 10.1504/IJBRA.2024.10062589
  • Brain Tumour Classification in MRI: Self Improved Osprey Optimised U-Net Model for Segmentation and Fused DeepNet Model based Classification
    by Devisivasankari P, Lavanya K 
    Abstract: Tumours rank as the tenth most prevalent global cause of death, with brain tumours representing a particularly serious medical condition stemming from the uncontrolled proliferation of brain cells, disrupting normal brain functions. The causes of brain tumours are unknown, emphasising early detection, treatment, and identification. In recent decades, researchers have explored novel brain tumour diagnosis technologies. Traditional diagnoses work poorly. Our cutting-edge brain tumour classification model addresses this issue with advanced deep learning (DL) classifiers and rapid-convergence, self-improving optimisation. Our SIOO-U-Net brain tumour classification model starts with enhanced picture fusion and Wiener filtering. The pre-processed image is segmented using SIOO-U-Net. Subsequently, the segmented map is used to extract features, including improved local gradient increasing pattern (LGIP), residual network (ResNet), and Visual geometry group 16 (VGG16) features. A novel hybrid brain tumour classification model uses these derived properties. PyramidNet and Bi-GRU classifiers are used in this hybrid DL classification model. Brain tumour classification outcomes are improved by using Bruce's formula-based score-level fusion with weight initialisation conditions. We used the BraTS 2015 dataset to test our SIOO-U-Net-based brain tumour classification classifier.
    Keywords: brain tumour classification; MRI; SIOO-U-Net segmentation; PyramidNet; score level fusion; visual geometry group 16; VGG16.
    DOI: 10.1504/IJBRA.2025.10062622
  • Meta-Heuristics for Feature Selection: A Comprehensive Survey and Comparative Analysis
    by Rishika Kumar, Ashish Jain, Inderjeet Kaur 
    Abstract: Feature selection (FS) is a crucial step in data pre-processing that aims to identify a subset of relevant features from a large pool of available features, while discarding irrelevant or redundant ones. Since the early 2000s, optimisation heuristic methods have gained popularity as an alternative to traditional FS methods. The literature shows that optimisation heuristics can efficiently search for a subset of relevant features that represents the data accurately. They are flexible, scalable, and can handle non-differentiable objective functions, making them suitable for FS. In this paper, we comprehensively review the optimisation heuristics that have been developed in the last decade and applied successfully to FS. Each algorithm is elucidated theoretically, with an in-depth explanation of its methodology. This survey presents the difficulties faced by optimisation heuristic FS algorithms, and analyses and highlights prospective research directions for the benefit of researchers working in this area.
    Keywords: feature selection; optimisation heuristics; data accuracy.
    DOI: 10.1504/IJBRA.2025.10062948
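
Several of the abstracts above rely on filter-style feature selection. As a rough illustration only, the sketch below shows one way the first stage described in the biomarker identification abstract (ranking genes by one-way ANOVA F-score and keeping the top quartile) could look. It is not the authors' implementation: the function names `anova_f_score` and `top_quartile_features`, the toy expression matrix, and the tie-handling and rounding choices are all assumptions made for this example.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_quartile_features(X, labels):
    """Return (indices of the top 25% of features by F-score, all F-scores)."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], labels) for j in range(n_features)]
    keep = max(1, n_features // 4)  # assumption: round down, keep at least one
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep]), scores


# Hypothetical toy data: 6 samples x 4 "genes", two classes.
X = [
    [1.0, 5.0, 0.1, 3.0],
    [2.0, 4.0, 0.2, 3.1],
    [1.5, 6.0, 0.0, 2.9],
    [1.2, 5.5, 2.1, 3.0],
    [1.8, 4.5, 2.0, 3.2],
    [1.4, 5.2, 2.2, 2.8],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = top_quartile_features(X, labels)
print("selected feature indices:", selected)
```

In this toy matrix only the third column separates the two classes, so it receives by far the largest F-score. A real microarray dataset would have thousands of genes, and the retained quartile would then be passed to a classifier for the second stage.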