Forthcoming articles


International Journal of Data Mining and Bioinformatics


These articles have been peer-reviewed and accepted for publication in IJDMB, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.


Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.


International Journal of Data Mining and Bioinformatics (4 papers in press)


Regular Issues


  • Accurate Annotation of Metagenomic data without species-level references
    by Haobin Yao, Tak-wah Lam, Hing-Fung Ting, Siu-Ming Yiu, Yadong Wang, Bo Liu 
    Abstract: In this paper, we propose a novel annotation tool, MetaAnnotator, for annotating metagenomic reads; it significantly outperforms all existing tools when only genus-level references exist in the database. In our experiments, MetaAnnotator assigns 87.5% of reads correctly (67.5% of reads are assigned to the exact genus), with only 8.5% of reads wrongly assigned. The best existing tool (MetaCluster-TA) achieves only 73.4% correct read assignment (with only 50.9% of reads assigned to the exact genus and 22.6% of reads wrongly assigned). The core concepts behind MetaAnnotator include: (i) we consider only exact k-mers in coding regions of the references, as they should be more significant and accurate; (ii) to assign reads to taxonomy nodes, we construct genome- and taxonomy-specific probabilistic models from the reference database; and (iii) we use the BWT data structure to speed up the k-mer matching process.
    Keywords: metagenomic data analysis; binning; accurate and fast annotation.

  • A Novel Low-rank Representation Method for Identifying Differentially Expressed Genes
    by Xiu-Xiu Xu, Ying-Lian Gao, Jin-Xing Liu, Ya-Xuan Wang, Ling-Yun Dai, Xiang-Zhen Kong, Sha-Sha Yuan 
    Abstract: Low-rank representation (LRR) has attracted considerable attention in recent years. However, LRR has a chief shortcoming: it uses the nuclear norm to approximate the non-convex rank function. This approximation minimizes all singular values, so the nuclear norm may not approximate the rank function well. In this paper, we propose a novel low-rank method that replaces the nuclear norm with the truncated nuclear norm to approximate the rank function, and we apply it to identifying differentially expressed genes. The truncated nuclear norm is defined as the sum of some smaller singular values, which may be a better approximation of the rank function than the nuclear norm. To ensure convergence, the optimization problem is solved by the augmented Lagrange multiplier method, which has the convergence property. The experimental results demonstrate that our method outperforms the LLRR, TRPCA and RPCA methods.
    Keywords: differentially expressed genes; truncated nuclear norm; low-rank; augmented Lagrange multiplier; TCGA data.

  • Medical Examination Data Prediction with Missing Information Imputation Based on Recurrent Neural Networks
    by Han-Gyu Kim, Gil-Jin Jang, Ho-Jin Choi, Myungeun Lim, Jaehun Choi 
    Abstract: In this work, recurrent neural networks (RNNs) for medical examination data prediction with missing information are proposed. The simple recurrent network (SRN), long short-term memory (LSTM) and gated recurrent unit (GRU) are selected among the many variations of RNNs for missing information imputation, and they are also used to predict future medical examination data. In addition, missing information imputation based on bidirectional LSTM is proposed to consider past as well as future information in the imputation process, whereas traditional RNNs can consider only past information during imputation. We conducted a medical examination result prediction experiment using an examination database of Koreans. The experimental results showed that the proposed RNNs worked better than the baseline linear regression method, and that the bidirectional LSTM performed best for missing information imputation.
    Keywords: Medical Examination Data Prediction; Recurrent Neural Network; Long Short-Term Memory; Gated Recurrent Unit; Bidirectional LSTM.

  • A hybrid-ensemble based framework for microarray data Gene selection
    by Amirreza Rouhi, Hossein Nezamabadi-pour 
    Abstract: With the advent and propagation of high-dimensional microarray data, the process of gene selection has become far more difficult and time-consuming, and classic feature selection methods are quickly becoming obsolete. Dealing with high-dimensional biomedical data is associated with problems such as the curse of dimensionality and an increased presence of redundant and irrelevant genes, all of which lead to a significant rise in classification error. This paper provides a framework for the combined use of ensemble and hybrid methods for gene selection in high-dimensional data, with the aim of increasing classification accuracy and reducing dimensionality. The proposed method is benchmarked on several microarray datasets. Comparison with the latest ensemble feature selection methods confirms the good performance of the proposed approach.
    Keywords: Gene selection; Feature selection; Microarray data; Hybrid methods; Metaheuristic; Ensemble methods.
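
Illustrative note on the truncated nuclear norm mentioned in the low-rank representation abstract above: that abstract defines it as the sum of some smaller singular values, in contrast to the nuclear norm, which sums all of them. The Python sketch below shows only that difference. It is not the authors' method: the function names and the parameter r (the number of largest singular values left out of the sum) are assumptions made for illustration, and the abstract does not say how that number is chosen.

```python
import numpy as np


def nuclear_norm(matrix):
    """Sum of all singular values of the matrix."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(singular_values.sum())


def truncated_nuclear_norm(matrix, r):
    """Sum of the singular values that remain after dropping the r largest.

    `r` is a hypothetical parameter name; the abstract only says that
    "some smaller singular values" are summed.
    """
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    if not 0 <= r <= singular_values.size:
        raise ValueError("r must lie between 0 and min(rows, columns)")
    return float(singular_values[r:].sum())


if __name__ == "__main__":
    # Toy matrix whose singular values are 5, 3 and 1.
    toy = np.diag([5.0, 3.0, 1.0])
    print("nuclear norm:", nuclear_norm(toy))
    print("truncated nuclear norm (r=1):", truncated_nuclear_norm(toy, 1))
```

For the toy matrix the nuclear norm is about 9 and the truncated norm with r = 1 is about 4. For a matrix of rank at most r the truncated norm is numerically zero, which is why it can track the rank more closely than the full nuclear norm does.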