International Journal of Bioinformatics Research and Applications (IJBRA) Inderscience Publishers - linking academia, business and industry through research

Forthcoming and Online First Articles

International Journal of Bioinformatics Research and Applications

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

International Journal of Bioinformatics Research and Applications (17 papers in press)

Regular Issues

A Review of Support Vector Machine in Cancer Prediction on Genomic Data
by Revathi L, Ramaswami M. M
Abstract: Cancer is the most prevalent disease that leads to death globally. According to the World Health Organization (WHO) report, cancer claims over 10 million lives yearly. Extensive research has focused on early detection and prevention through clinical and laboratory studies. Genomic technologies enable the analysis of large cancer-related datasets, while machine learning algorithms aid in early detection. This paper explores earlier studies on supervised machine learning techniques and feature selection methods on high-dimensional gene expression data. Furthermore, this study emphasises the significance of support vector machine (SVM) in cancer prediction and diagnosis, highlighting its superior performance compared to other classification methods. In particular, the choice of kernel function strongly influences the performance of SVM. Additionally, feature selection extracts informative genes from microarray data, which leads to high predictive accuracy and lower computational complexity. The paper concludes that both machine learning approaches and SVM make substantial contributions to cancer prediction.
Keywords: cancer prediction; feature selection; gene expression; kernel function; machine learning; ML; supervised; support vector machine; SVM.
DOI: 10.1504/IJBRA.2024.10060711

DNA Barcoding and Annotation Study on Anisomeles malabarica (L.) R.Br. in BOLD system v.4
by Santhanalakshmi Balasubramaniam, Sivanandhan Ganeshan, Selvaraj Natesan, Kapildev Gnanajothi
Abstract: Plant DNA barcoding is primarily used for species identification, with applications in taxonomy, conservation, ecology, forensics, and certification of herbal products. This technique follows the process of DNA extraction, amplification, and sequencing without the requirement of taxonomic expertise, making it easier, faster, and more reliable than conventional approaches. The Barcode of Life Data (BOLD) system was developed by the Center for Biodiversity Genomics to store and analyse the data generated from massive barcodes. This system holds 13 million barcodes in organised projects, and represents 72,000 plant species. This article discusses the features of the BOLD system v.4, with a brief review on plant DNA barcoding. Anisomeles malabarica, of the Lamiaceae family, has been selected as the model plant, and its barcode data are documented in the workbench of BOLD. A unique DOI 10.5883/DS-AMBC22 and accession numbers from GenBank were generated for the barcodes in the public portal for reference and retrieval.
Keywords: DNA barcoding; species identification; BOLD systems; Anisomeles malabarica; GenBank; matK; rbcL; trnH-psbA; dataset; projects.
DOI: 10.1504/IJBRA.2024.10060749

Sparse representation based motor imagery EEG classification towards asynchronous BCI systems
by C. Sivananda Reddy, M. Ramasubba Reddy
Abstract: Most of the existing motor imagery (MI)-based brain-computer interface (BCI) systems operate synchronously with system-generated time slots. But in real-world applications, users want to control the interface asynchronously at their own convenience. The main challenge in such asynchronous BCIs lies in the detection of the relax period. In this study, a sparse representation-based classification (SRC) scheme is proposed for asynchronous BCI systems. The dictionary needed for the SRC scheme is learned from the extracted EEG features using the K-SVD algorithm. The proposed framework is evaluated on two benchmark datasets from BCI competitions III and IV. The results show that SRC detects relax states and MI states better than the well-known linear discriminant analysis classification method. The proposed scheme also achieves higher accuracy when classifying the left-hand MI, right-hand MI, and the relaxed state.
Keywords: brain computer interface; BCI; electroencephalogram; EEG; motor imagery; MI; sparse representation based classification; SRC; dictionary learning; DL.
DOI: 10.1504/IJBRA.2024.10060881

Biomarker identification from gene expression: An effective computational pipeline
by Emon Asad, Ayatullah Faruk Mollah
Abstract: Discovering biomarkers from microarray data is an extremely important research subject, as biomarkers help to diagnose disease types, find therapeutic plans for a disease, and contain crucial biological information about organisms. In this paper, a machine learning-based two-stage biomarker identification technique for microarray datasets is presented. In the first stage, analysis of variance (ANOVA) F-scores are applied to select the top quartile as candidate biomarkers, whereas in the second stage, performance of the possible biomarkers is examined with an ensemble classifier and the responsible biomarker(s) are identified based on their ability to characterise corresponding genetic disease(s). Interestingly, this method yields 100% classification accuracy with only one biomarker for each of the six different types of publicly available microarray datasets considered in this work, which is undoubtedly superior to many state-of-the-art methods. The selected biomarkers are also found to be biologically relevant and meaningful in terms of gene ontology, DisGeNET and various biochemical pathway terms.
Keywords: biomarker identification; genetic diseases; microarray gene expression; feature selection; analysis of variance; ANOVA.
DOI: 10.1504/IJBRA.2024.10061254
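
The first stage described in this abstract (rank features by one-way ANOVA F-score, keep the top quartile) could be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `anova_f` and `top_quartile`, the dictionary input layout, and the "at least one feature" rule are all hypothetical.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F-statistic; `groups` is a list of per-class value lists."""
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        # Degenerate case: no within-class spread.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def top_quartile(expr, labels):
    """expr maps gene name -> expression values (one per sample);
    labels holds the class label of each sample (hypothetical layout)."""
    classes = sorted(set(labels))
    scores = {}
    for gene, values in expr.items():
        groups = [[v for v, y in zip(values, labels) if y == c] for c in classes]
        scores[gene] = anova_f(groups)
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, len(ranked) // 4)      # assumption: always keep at least one
    return ranked[:keep]
```

The candidates returned by `top_quartile` would then be passed to whatever classifier is used in the second stage.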

Diabetic Prediction Framework using Optimization Strategy via Optimal Weighted Score-based Deep Ensemble Network to Support Diabetic Patients
by Santosh Kumar Bejugam, Jyothi Vankara
Abstract: Diabetes is one of the dangerous diseases that increase blood glucose levels and it affects the patient’s life. Next, in the deep feature extraction stage, the collected data is employed as the input. Here, the deep features are extracted using one-dimensional convolutional neural network (1DCNN). Then, the acquired optimal features are offered as the input to intelligent deep ensemble network (IDENet) that holds the networks such as long short-term memory (LSTM), 1DCNN, deep temporal context networks (DTCN) and extreme learning (EL). The parameters of IDENet are tuned by enhanced light spectrum with horse herd optimisation (ELS-HHO). Further, the attained predicted values from the IDENet are fed as the input to the weighted fusion of predicted values. Then, their weights are tuned by ELS-HHO to attain the effective glucose prediction outcome. Finally, the suggested glucose prediction model secured a better prediction rate than the classical glucose prediction models in experimental observation.
Keywords: diabetics prediction framework; ELS-HHO; optimal weighted predicted scores; intelligent deep ensemble network; IDENet.
DOI: 10.1504/IJBRA.2024.10061485

In-silico Rituximab Protein Engineering to Improve Humanization and Reduce Immunogenicity
by Harit Kasana, Harish Chander, Ashwani Mathur
Abstract: Rituximab is a monoclonal antibody with a high degree of specificity towards the CD20 antigen, found on the surface of B lymphocytes, and is used in the treatment of diverse B cell lymphomas and autoimmune disorders. Rituximab is a chimeric monoclonal antibody for which adverse effects have been reported due to the presence of non-human sequences. Here, we attempted to improve the humanness score of rituximab using a computational approach by using combinatorial mutations at the sequence patches. These changes were imposed in: a) CDR; b) non-CDR; c) CDR + non-CDR mutants at Ala49 Gly, Thr50 Ala, and Leu53 Arg, Ser5 Thr, Ala9 Gly, and Ile10 Thr. These mutations did not affect the structural stability, as interpreted from the MD simulation analysis. However, non-CDR mutants showed marginally higher structural variation compared to CDR mutants and native rituximab. It is suggested in this study that this rational design can improve the humanness characteristics of rituximab without affecting its structural and therapeutic behaviour.
Keywords: rituximab; protein engineering; immunogenicity; molecular docking; molecular dynamic simulation.
DOI: 10.1504/IJBRA.2024.10061490

A Dense sub-graph based approach for Automatic detection of Optic Disc
by Subrata Jana, Gour Sundar Mitra Thakur, Tribeni Prasad Banerjee, Pabitra Mitra
Abstract: Glaucomas are a group of eye disorders characterized by the degeneration of the optic nerve, predominantly due to elevated intraocular pressure. Such degradation can culminate in irreversible vision loss. A significant challenge in the study and treatment of glaucomas is the accurate localization of the optic disc, a crucial anatomical landmark associated with disease progression. This research aims to present an advanced graph-based method for the automatic identification of the optic disc's exact location, addressing the existing challenges in conventional techniques. Leveraging the K-dense sub-graph approach, our method offers a novel perspective on optic disc localization. It involves the interpretation of intricate patterns in retinal images to pinpoint the affected optic disc area. When evaluated on recognized databases like DRIVE, Drishti-GS1, and STARE, our model exhibited an outstanding accuracy rate of 93% in optic disc localization. This work contributes an innovative and efficient method to the field of ophthalmological research.
Keywords: Optic Disc; K-dense sub-graph; Graph-based approach; Optic Disc Localization.
DOI: 10.1504/IJBRA.2024.10061621

Utilising Deep Convolutional Neural Networks and Hybrid Clustering Techniques for Predicting Cancer Blood Disorders
by Pulla Sujarani, M. Yogeshwari
Abstract: Blood malignancies and various blood disorders pose significant health challenges across all age groups. This study introduces adaptive fast fuzzy C means hybrid clustering (AFFCMHC) and binary adaptive Otsu (BAO) thresholding for image processing to segment cancer-related blood abnormalities. We recommend DCNNs for cancer blood abnormalities prediction. Blood illness images are filtered and enhanced in our framework. The 2D hybrid wavelet frequency domain bilateral filter (2D HWFDBF) removes noise from photographs. Denoising and 2D EPHI improve image clarity. Clustering and thresholding segment better pictures. Clustering and image thresholding are done using AFFCMHC and BAO, respectively. Features are extracted from a real-time collection of microscopic blood sample images from 1,000 cancer patients using the grey level co-occurrence matrix (GLCM). Our revolutionary DCNN classification architecture trains quickly. With 98% accuracy, our method is incredibly successful. We compare our system to existing classifiers to test its performance. We developed a complete system for segmenting and predicting cancer-related blood abnormalities, exceeding current methods.
Keywords: cancer blood disorder; deep convolutional neural networks; DCNN; classification; grey level co-occurrence matrix; GLCM; tumour recognition; medical image analysis.
DOI: 10.1504/IJBRA.2024.10061714

Analysis of Penicillin Binding Sites for Determining Antibiotic Resistance in Pakistan
by Amna Sethi, Amna Farrukh, Aena Rasheed, Eesha Adnan, Muhammad Abdullah Khan
Abstract: Antimicrobial resistance (AMR) has now become a global challenge. The increasing resistance of bacteria combined with the misuse of drugs has led to an era of antibiotics that prove to be of no value to the affected individual. This raises the question of how to counter antimicrobial resistance (AMR) and what reason lies behind the increasing resistance of these bacteria. This research has shown the causes of resistance due to motif combinations, identified through the literature review. Two methods were employed for curation: manual, and automated through custom-designed code. The results indicate that a decreasing number of motifs corresponds to higher bacterial resistance. This study would help in identifying a mutation pattern that could counter the antibiotic effectiveness problem. The obtained results can be further validated through experimental procedures to deduce a concrete hypothesis and to design alternatives to the existing ineffective antibiotics.
Keywords: antibiotics; antimicrobial resistance; AMR; penicillin binding sites; motifs; Pakistan.
DOI: 10.1504/IJBRA.2024.10061835

Ensemble Feature Selection (EFS) And Deep Learning Ensemble (DLE) Classifier For Cervical Cancer Diagnosis
by Anjali Kuruvilla, B. Jayanthi
Abstract: Cervical cancer has become one of the foremost causes of cancer mortality in women. Developing a new approach requires improving system performance precision. Cervical cancer has many risk factors. Cervical cancer test parameters must be considered when classifying patients based on results. Recently, cervical cancer prediction features have been assessed using several feature selection methodologies. Ensemble feature selection (EFS) outperforms individual techniques. This study classifies cervical cancer cells using a deep learning ensemble (DLE) classifier. EFS combining the results of EBFO, EEHO, and RFE yields better results than using a single FS approach. The DLE classifier uses heterogeneous base learners (GAN, BGRU, and DWCNN) and a meta-learner to predict cervical cancer from risk variables. DLE classifier builds numerous basic classifiers on which a new predictor outperforms any component. Stacking trains many models on one dataset. The model is generated by segmenting the training set again using K-fold cross-validation. The suggested system uses the DLE classifier and synthetic minority oversampling technique (SMOTE). The data source has 32 features and four classes: Hinselmann, Schiller, cytology, and biopsy. Precision, recall/sensitivity, F-measure, specificity, and accuracy are calculated using a confusion matrix to determine the superiority of all classification algorithms like random forest (RF) and GAN.
Keywords: bidirectional gated recurrent unit; BGRU; cervical cancer; entropy butterfly optimisation algorithm; EBFO; ensemble feature selection.
DOI: 10.1504/IJBRA.2024.10061872

A Novel Deep Learning-Based Cardiac Disease Classification
by M. Chitra Devi, M. Ramaswami, C. Sundar
Abstract: The heart is a fist-sized organ that pumps blood throughout the human body. Cardiac disease (CAD) is one of the deadliest diseases, threatening the lives of millions of people worldwide. Nowadays, several machine learning and deep learning techniques are used for the diagnosis of cardiac disease in its early stages. The availability of large amounts of cardiac disease related medical data has aided in the development of automated machine learning and deep learning-based diagnosis systems. To overcome the limitations of traditional approaches, this paper proposes a novel deep learning PCA-1D ConvNet (one-dimensional convolution neural network) architecture for the classification of cardiac disease and non-cardiac disease and predicting the cardiac disease. The proposed network achieves over 99.87% training accuracy and 99% test accuracy on the dataset along with 100% precision, 98% recall and 98.98% F1-score.
Keywords: cardiac disease; classification; deep learning; 1D ConvNet; overfitting.
DOI: 10.1504/IJBRA.2024.10061934

Uniform Distribution Tuna Swarm Optimization (UDTSO) and Deep Neural Network (DNN) for Fetal Health Classification
by JANSI B, Sumalatha V
Abstract: Foetal health is generally assessed by foetal heart rate (FHR) monitoring throughout the antepartum period. FHR analysis is a difficult and illogical process because of restricted dependability. Previous research only examined cardiotocographic (CTG) dataset classification accuracy, ignoring computational time, a critical clinical issue. Using uniform distribution tuna swarm optimisation (UDTSO), this paper selects the most important CTG traits. This study developed a machine-learning algorithm to differentiate normal and abnormal foetal CTG data. The proposed study involves pre-processing, FS, classification, and outcomes evaluation. The dataset is normalised using min-max normalisation first in pre-processing. Min-max normalisation rescales features to the 0 to 1 range. In the second feature selection step, the UDTSO algorithm selects a subset of input characteristics to evaluate accuracy and choose the optimum solution. Third, a deep neural network (DNN) classifies CTG recordings as normal (N), suspect (S), or pathologic (P). DNN's AlexNet-SVM captures convolution layer filter data. Max pooling minimises weights and concatenates output from a collection of neurons. The fully connected layers now have the AlexNet-SVM classifier to reduce time complexity. Classifiers are assessed on precision, recall, f-measure, and accuracy. The CTG dataset comes from UCI Machine Learning Repository.
Keywords: foetal heart rate; FHR; uniform distribution tuna swarm optimisation; UDTSO; deep neural network; DNN; support vector machine; SVM; cardiotocographic; CTG.
DOI: 10.1504/IJBRA.2024.10062096
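
The min-max normalisation step mentioned in this abstract maps each feature x to (x - min) / (max - min). A minimal sketch follows; the function name `min_max_normalise` and the handling of constant columns are assumptions for illustration, not details taken from the paper.

```python
def min_max_normalise(column):
    """Rescale one feature column to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Example with made-up baseline heart-rate values (beats per minute).
print(min_max_normalise([120, 140, 160]))  # [0.0, 0.5, 1.0]
```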

An effective review on the prediction and analysis of infectious lung diseases using machine learning algorithms
by V. Indira, D. Annal Priyadarshini, Geetha R, V. Sujatha
Abstract: Lung diseases are increasing seriously nowadays due to rapid environmental changes and the variety of viruses in the universe. This review focuses on using machine learning and deep learning algorithms to predict and analyse lung cancer, tuberculosis, coronavirus disease 2019 (COVID-19), influenza, asthma, and chronic obstructive pulmonary disease (COPD). In this paper, machine learning and deep learning models are used to analyse the affected lung mortality and have decreased the amount of physical work needed. This paper inspects how numerous machine-learning algorithms can be used to discover many lung states. The main aim of this review is to envision several tempers in lung disease to analyse using machine learning and diagnose the survival issue and the domain’s feasible future. In addition, it examines the accuracy and efficiency with which machine learning and deep learning categorise lung disease with minimal possible error.
Keywords: machine learning; ML; deep learning; computer vision; lung disease.
DOI: 10.1504/IJBRA.2024.10062141

A New Machine Learning Approach to Classify MRI of Brain Tumour Using SAE + LSTM
by Biswaranjan Mishra, Kakita Gopal, Bijay Paikaray, Srikant Patnaik
Abstract: A brain tumour is a serious condition that can severely harm brain cells and eventually progress to cancer, which is life-threatening. The patient's chances of survival can be improved when the tumour stages are detected early. The proposed tumour diagnosis uses a fused feature set to increase the classifier's accuracy. To begin with, the features from the MRI images are extracted using the grey level co-occurrence matrix (GLCM) and histogram of oriented gradients (HOG). After dimensionality reduction, features are chosen with stacked autoencoder (SAE). Second, the high-level features from the MRI images are extracted using the channel-wise attention block. The long short-term memory (LSTM) is trained to produce the results of the classification using the fused features from SAE and the attention block. The proposed approach is evaluated with the BRATS dataset for the years 2018 to 2020. The accuracy attained over various datasets is 97%, 95.56% and 95.23%.
Keywords: tumour diagnosis; optimal features; stacked autoencoder; SAE; attention block; long short-term memory; LSTM.
DOI: 10.1504/IJBRA.2024.10062589

Brain Tumour Classification in MRI: Self Improved Osprey Optimised U-Net Model for Segmentation and Fused DeepNet Model based Classification
by Devisivasankari P, Lavanya K
Abstract: Tumours rank as the tenth most prevalent global cause of death, with brain tumours representing a particularly serious medical condition stemming from the uncontrolled proliferation of brain cells, disrupting normal brain functions. The causes of brain tumours are unknown, emphasising early detection, treatment, and identification. In recent decades, researchers have explored novel brain tumour diagnosis technologies. Traditional diagnoses work poorly. Our cutting-edge brain tumour classification model addresses this issue with advanced deep learning (DL) classifiers and rapid-convergence, self-improving optimisation. Our SIOO-U-Net brain tumour classification model starts with enhanced picture fusion and Wiener filtering. The pre-processed image is segmented using SIOO-U-Net. Subsequently, the segmented map is used to extract features, including improved local gradient increasing pattern (LGIP), residual network (ResNet), and Visual geometry group 16 (VGG16) features. A novel hybrid brain tumour classification model uses these derived properties. PyramidNet and Bi-GRU classifiers are used in this hybrid DL classification model. Brain tumour classification outcomes are improved by using Bruce's formula-based score-level fusion with weight initialisation conditions. We used the BraTS 2015 dataset to test our SIOO-U-Net-based brain tumour classification classifier.
Keywords: brain tumour classification; MRI; SIOO-U-Net segmentation; PyramidNet; score level fusion; visual geometry group 16; VGG16.
DOI: 10.1504/IJBRA.2025.10062622

Meta-Heuristics for Feature Selection: A Comprehensive Survey and Comparative Analysis
by Rishika Kumar, Ashish Jain, Inderjeet Kaur
Abstract: Feature selection (FS) is a crucial step in pre-processing of data that aims to identify a subset of relevant features from a large pool of available features, while discarding irrelevant or redundant ones. Since the early 2000s, optimisation heuristic methods have gained popularity as an alternative to traditional FS methods. In the literature, it has been shown that optimisation heuristics can efficiently search for a subset of relevant features that can represent the data accurately. They are flexible, scalable, and can handle non-differentiable objective functions, making them suitable for FS. In this paper, we comprehensively review those optimisation heuristics that have been developed in the last decade and applied successfully for FS. Each algorithm is elucidated theoretically, with an in-depth explanation of its methodology. This survey presents the difficulties faced by optimisation heuristic FS algorithms, and analyses and highlights prospective research directions for the benefit of researchers working in this area.
Keywords: feature selection; optimisation heuristics; data accuracy.
DOI: 10.1504/IJBRA.2025.10062948

Content Based Medical Image Retrieval Using Multi-Feature Extraction and Patch Sorensen Similarity Indexing Technique
by K. Saminathan, Amsavalli S, M. Chithra Devi
Abstract: In the intricate field of medical imaging, the analysis of image content plays a pivotal role in classification, retrieval, and indexing tasks, as well as in recognising objects and different settings within the image. While innovative, traditional methods typically fail to efficiently and accurately process medical image databases' massive and complicated data. Due to this shortcoming, discrete wavelet coefficients-bag of visual words-contour-local binary pattern (DWC-BoVW-Contour-LBP) relevance fusion was developed. A trimmed mean filter and contrast limited adaptive histogram equalisation (CLAHE) remove noise and boost contrast to optimise the image for feature extraction in this novel method. The system carefully extracts low-level frequency features using discrete wavelet transform (DWT), textural features using local binary pattern (LBP), shape features using contour analysis, and visual features using bag of visual words (BoVW). Pixel image fusion is used to combine various features into a complete picture. Patch Sorensen similarity measurement ranks database photos by query resemblance and selects the top 10 most similar images. The algorithm's precision, F-score, and recall were superior in the TCIA-CT database, showing substantial progress in content-based medical image retrieval (CBMIR).
Keywords: bag of visual words; BoVW; discrete wavelet transform; DWT; image retrieval; Sorensen similarity indexing technique.
DOI: 10.1504/IJBRA.2025.10063649
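
The ranking step in this abstract can be illustrated with the quantitative Sorensen (Sorensen-Dice) similarity between two non-negative feature vectors, 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b)). The abstract does not define its patch-wise variant, so the sketch below is only an assumption about the general idea; `sorensen_similarity`, `top_k`, and the dictionary-based database are hypothetical names.

```python
def sorensen_similarity(a, b):
    """Quantitative Sorensen similarity of two non-negative vectors (1.0 = identical)."""
    total = sum(a) + sum(b)
    if total == 0:
        # Assumption: two empty histograms are treated as identical.
        return 1.0
    return 2.0 * sum(min(x, y) for x, y in zip(a, b)) / total

def top_k(query, database, k=10):
    """Return the names of the k database entries most similar to the query.
    `database` maps image name -> feature vector (hypothetical layout)."""
    ranked = sorted(
        database,
        key=lambda name: sorensen_similarity(query, database[name]),
        reverse=True,
    )
    return ranked[:k]
```

With k=10 this mirrors the "top 10 most similar images" selection described above, applied to whole-image vectors rather than patches.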
