Forthcoming and Online First Articles

International Journal of Data Mining and Bioinformatics (IJDMB)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

Register for our alerting service, which notifies you by email when new issues are published online.

We also offer RSS feeds which provide timely updates of tables of contents, newly published articles and calls for papers.

International Journal of Data Mining and Bioinformatics (5 papers in press)

Regular Issues

  • Toxicity Detection of Small Drug Molecules of the Mitochondrial Membrane Potential Signalling Pathway using Bagging-based Ensemble Learning
    by Vishan Kumar Gupta  
    Abstract: This study is focused on QSAR for the detection of chemical and drug-induced toxicities of small drug molecules of mitochondrial membrane potential (MMP). It is based on the various physicochemical properties of MMP to reduce the animal testing, time, and cost associated with risk assessment and various management factors. There is a total of 8070 drug molecules of MMP, out of which 1260 drug molecules are toxic and the remaining 6810 are non-toxic. Pa-DEL descriptor software is used to extract features of the MMP Signalling Pathway. Initially, the class imbalance issue is fixed using the ensemble learning approach, and feature selection is performed using a random forest importance algorithm. A bagging-based ensemble model is proposed for the toxicity detection, and it is found that our proposed ensemble method based upon the voting of five base classifiers achieved 97.62% accuracy. Finally, K-fold cross-validation is applied to check the consistency of the proposed model.
    Keywords: Mitochondrial Membrane Potential; Molecular descriptor; Classification; Drug Toxicity; Random forest; Feature selection; Class imbalance; Validation; Decision Tree; Ensemble Learning.
    DOI: 10.1504/IJDMB.2022.10052684
  • DSCC: a data set of cervical cell images for cervical cytology screening
    by Hua Chen, Juan Liu, Yu Jin, Baochuan Pang, Dehua Cao, Di Xiao 
    Abstract: The lack of large-scale public data sets aiming for cytological screening of cervical cancer has hindered the research of developing robust cytological screening models. To address this problem, we develop a data set DSCC containing 15,509 cervical cell images labelled by experienced cytologists. As far as we know, the number of cell images in DSCC is nearly four times that of the largest data set known at present. Considering that the purpose of cytological screening is not for cancer diagnosis, but for judging whether the subject needs further examination, we classify the cell images into three categories: Normal, SIL (squamous intra-epithelial lesion or cancer cell, suggesting further examination), ASC (atypical squamous cell, needing to be confirmed by a professional cytologist). Furthermore, we also provide a nucleus mask map for each cell based on the annotation of the cytologists, to help researchers conduct different studies. Based on the mask map, we extract 78 features for each cell that are included in the data set as well. Experimental results demonstrate that DSCC is very useful for researchers to build classification methods for automatic cervical cytology screening.
    Keywords: Cell image; Cervical cytology screening; Data set; Classification; Machine learning.
    DOI: 10.1504/IJDMB.2022.10054155
  • Automatic Summarization of Product Reviews Using Natural Language Processing and Machine Learning Methods: A Literature Review
    by Sonia Rani, Tarandeep Singh Walia 
    Abstract: Due to advancements in digital technologies, online shopping is increasingly popular. User reviews are the foremost aspect of understanding consumers' intentions about the products. These reviews help e-commerce companies and manufacturers increase their products' productivity and business growth. Artificial intelligence and natural language processing methods are helpful in extracting vital information from user reviews. This study describes the importance of automatic product review summarization and the role of various natural language processing and machine learning methods employed to create intelligent systems. The current deep learning and transformer-based methods have strongly affected the development of NLP applications. The main purpose of this study is to explore the techniques of automatic multi-document summarization, datasets, evaluation metrics, and some limitations of various researchers' studies. A comparative analysis of rule-based and machine-learning methods is also described in this study.
    Keywords: Abstractive; Extractive; Natural language processing; Rule-based; Machine learning; Reviews summarization; Deep learning.
    DOI: 10.1504/IJDMB.2022.10054439
  • Predictive model to determine the growth of mobile money transactions in Zambia using data mining techniques
    by Richard Mwila, Douglas Kunda 
    Abstract: Mobile money has been known to be a successful venture around the world, especially for African countries, due to the many limitations of traditional banks, such as operations, expensive transaction costs and a cumbersome process to open an account, to mention but a few. The presence of mobile money has not only allowed the unbanked population to have accounts but has also alleviated poverty for many rural communities. Zambia has seen an increase in mobile money accounts, and Covid-19 has accelerated this increase. Therefore, this paper sought to determine which data mining algorithm best predicts mobile money transaction growth. This paper was quantitative in nature and used aggregated monthly mobile money data (from Zambian Mobile Network Operators) from 2013 to 2020 as its sample, which was collected from the Bank of Zambia and the Zambia Information and Communications Technology Authority. The paper further used the WEKA data mining tool for data analysis, following the Cross-Industry Standard Process for Data Mining (CRISP-DM) guidelines.
    Keywords: Linear regression; Support vector machine; Random forest; mobile money transactions; K-nearest neighbor; multilayer perceptron.
    DOI: 10.1504/IJDMB.2022.10054589
  • Classification of TCGA related research articles based on cancer types and experimental strategy
    by Deepika Kulshreshtha, Arindam Deb, Krishanpal Anamika 
    Abstract: The Cancer Genome Atlas (TCGA) was launched in the year 2006 to accelerate the comprehensive understanding of the genetics of cancer using advanced high throughput technologies. It has been helping to generate new cancer therapies, prognostic methods, diagnostic methods and preventive strategies. This led to an exponential increase in scientific articles utilizing and corroborating information and data from TCGA in various related studies. With this exponential increase in the number of articles, it is challenging to identify specific articles of interest for a particular cancer type. It is even more challenging to identify articles by specific data types. In this work, we have built a web-tool, CTP (Classification of TCGA Publications), to systematically classify the articles that are utilizing TCGA data. This tool enables users to access all the relevant articles available for a particular cancer type or experimental strategy utilizing TCGA data.
    Keywords: TCGA; cancer; web-tool; literature search; article classification.
    DOI: 10.1504/IJDMB.2022.10054713