Forthcoming articles


International Journal of Computational Biology and Drug Design


These articles have been peer-reviewed and accepted for publication in IJCBDD, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.


Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.


International Journal of Computational Biology and Drug Design (28 papers in press)


Regular Issues


  • Data Acquisition and Electrical Instrumentation Engineering Modelling for Intelligent Learning and Recognition
    by Jun Qin, Yuhao Jiang 

  • Development of interactive computer learning program for genetics and molecular biology applications
    by Xiaoli Yang, Bin Chen, Yifan Cai, Charles Tseng 

    by Lilly Saleena, Priya Swaminathan 
    Abstract: Malaria remains one of the most challenging and dominant public health issues, infecting about 300-500 million people and killing about three million people. The most serious and fatal malarial infections are caused by Plasmodium falciparum, and the parasite has developed resistance to commonly employed therapeutics. Hence the objective is to develop a novel anti-malarial drug targeting dihydroorotate dehydrogenase (DHODH), an enzyme involved in pyrimidine biosynthesis that is essential for parasite growth. A decrease in parasite growth correlated with a decrease in levels of DHODH mRNA. Thus targeting this enzyme leads to a potential anti-malarial drug. DHODH exists in both humans and Plasmodium falciparum. Targeting the former leads to the discovery of anti-proliferative and anti-inflammatory agents. Sequence analysis and structure comparison of human and Plasmodium falciparum DHODH reveal similarities and variations between them, thereby providing a chance to design a specific inhibitor. High-throughput virtual screening of the existing anti-malarial drugs acting on DHODH is to be performed using the PubChem and BindingDB databases. Pharmacophore mapping and searching was done for the top twenty virtual screening compounds using the HipHop algorithm. The compounds thus obtained are to be docked with both human and Plasmodium DHODH. Comparing the results helped in the identification of inhibitors specific to Plasmodium and human DHODH. Potential anti-malarial and anti-inflammatory lead compounds that have similar structures to the specific inhibitors can be further developed to combat naturally resistant strains of Plasmodium falciparum.
    Keywords: Plasmodium falciparum; DHODH; anti-malarial; anti-inflammatory; pharmacophore.

  • In-silico mutational study of ferulic acid decarboxylase for improvement of substrate binding empathy
    by Pravin Kumar, Shashwati Ghosh Sachan, Raju Poddar
    Abstract: Biotransformation of ferulic acid by microorganisms provides a better alternative for production of flavor and fragrance compounds like 4-vinylguaiacol and vanillin. Ferulic acid is transformed to 4-vinylguaiacol via the non-oxidative decarboxylation pathway by ferulic acid decarboxylase (FADase). Here we report a computational mutational analysis of the active site of FADase. Site-directed mutations (single nucleotide polymorphisms, SNPs) were introduced using in-silico molecular modeling methods. Energy minimization, dynamic cross-correlation map (DCCM) and principal component analysis (PCA) methods were subsequently applied to validate the different conformers (SNPs) of FADase. The substrate ferulic acid was docked with the different SNPs. It was observed that certain amino acids at the active site, such as Tyr21, Trp25, Tyr27 and Glu134, are responsible for better binding to ferulic acid. Further, structure analysis and docking studies show that the mutated form Y27F (Tyr27Phe) of FADase has a better binding affinity towards ferulic acid than its native form.
    Keywords: Ferulic Acid Decarboxylase; Enzyme modeling; site directed mutation; DCCM; PCA; docking.

  • Flexible Molecular Docking: Application of Hybrid Tabu-Simplex Optimization
    by Ghania KHENSOUS, Belhadri MESSABIH, Abdallah CHOUARFIA, Bernard MAIGRET 
    Abstract: In this paper, we present a molecular docking method to predict the optimal binding pose of a flexible ligand in a flexible protein-binding pocket. For this purpose, a Tabu global search optimization algorithm is used, and the best Tabu solutions are then refined using the Nelder-Mead Simplex local search optimization algorithm. Most docking methods use scoring functions to approximate the binding affinity between the two molecular partners. In our application, the intra-molecular and intermolecular energies are calculated explicitly from a classical molecular mechanics model, which includes polarization terms. The variables of our optimization problem are the ligand positions (Euler angles + translation vector), the ligand and the protein side chains dihedral angles instead of the Cartesian coordinates in order to reduce the problem dimensionality. While the GOLD software (GOLD for Genetic Optimization for Ligand Docking) is usually considered as a standard in molecular docking, our docking approach is illustrated on four protein/ligand complexes for which GOLD failed, suggesting that the proposed method is promising.
    Keywords: Drug Design; Metaheuristic Optimization; Protein-Ligand Docking; Simplex Algorithm; Tabu Search Algorithm.

  • Interaction studies of Angelica polymorpha and Beilschmiedia pulverulenta phytochemicals with acetylcholinesterase as anti-Alzheimer's disease target
    by Tomisin Happy Ogunwa
    Abstract: Angelica polymorpha and Beilschmiedia pulverulenta are medicinal plants used locally by people in some parts of Asia and Africa for their beneficial health effects, particularly in the treatment of Alzheimer's disease (AD). The phytoconstituents responsible for such bioactivity have recently been identified in the plants. Herein, an in silico approach was used to explore the interaction of these phytochemicals with acetylcholinesterase (AChE), a validated target in the treatment of AD, to provide insights into their precise binding pattern and affinity, order of chemical interaction, inhibitory potential and the residues that contribute to the stability of the enzyme-phytoconstituent complex. With binding affinities ranging from -7.0 kcal/mol to -10.2 kcal/mol and tacrine-comparable orientation, the chemical scaffolds of the phytochemicals from both plants displayed deep penetration and fit conveniently into the narrow gorge of AChE. Optimization of these ligand scaffolds might yield new AChE inhibitors with desirably higher efficacy.
    Keywords: Phytoconstituents; Angelica polymorpha; Beilschmiedia pulverulenta; Molecular interaction; Docking.

  • Human Caveolin-1 a potent inhibitor for prostate cancer therapy: a computational approach
    by Uzma Khanam, Balwant Kishan Malik, Puniti Mathur, Bhawna Rathi
    Abstract: Caveolin-1 (Cav-1) is a 22 kDa caveolae protein that acts as a scaffold within caveolar membranes. It interacts with the alpha subunits of G-proteins and thereby regulates their activity. Earlier studies reported elevated levels of caveolin-1 in the serum of prostate cancer patients. Secreted Cav-1 promotes angiogenesis, cell proliferation and anti-apoptotic activities in prostate cancer patients. Cav-1 upregulation is mainly related to prostate cancer metastasis. In view of these facts, the present study was designed to explore Cav-1 as a target for prostate cancer therapy using a computational approach. Molecular docking, structure-based molecular modelling and molecular dynamics simulations were performed to investigate Cav-1 inhibitors. A predictive model was generated and validated to establish a stable structure. The ZINC database of biogenic compounds was used for induced fit docking (IFD) and high-throughput virtual screening. The H-bond interactions of the compounds with active site residues of Cav-1 were estimated by IFD and 100 ns molecular dynamics simulations. The reported compounds showed significant binding and thus can be considered potent therapeutic inhibitors of Cav-1. This study provides valuable insight into the biochemical interactions of Cav-1 for therapeutic applications and warrants experimental validation of the predicted active(s).
    Keywords: Molecular dynamics simulation; virtual screening; molecular docking; prostate cancer; caveolin-1; induced fit docking; protein-protein interaction network.

  • An in silico approach to design a potential drug for Haemophilia A
    by Srishti Munjal, Gaurav Jaisawal, Navodit Goel, Udai Pratap Singh, Ajay Vishwakrma, Abhinav Srivastava
    Abstract: Haemophilia A has been known as a disease since the late 20th century, but to date no cure has been developed for it. Treatments that temporarily relieve bleeding episodes include new factor replacement therapies with longer half-lives, which reduce the frequency of blood transfusions. There is a need to devise a new drug for the disease. In silico drug design is a powerful tool for designing a drug molecule in comparatively less time. In this study, a new drug molecule was designed using bioinformatics tools. The causative gene was found to be the X-linked F8 gene, and the corresponding protein coagulation factor VIII. Material and Methods: Target proteins were identified from protein databases and their structures were examined. Cavities in the proteins were determined using SPDBV (Swiss PDB Viewer). Ligands and their isomers, following Lipinski's rule of five, were prepared through Molinspiration. Docking between the ligands and target proteins was performed using Molegro Virtual Docker. Results: Thirteen proteins were selected and twelve ligands were prepared. Docking studies were performed and two criteria were compared: MolDock score and hydrogen bond score. The best values, -836.722 for MolDock score and -55.02 for H-bond score, were obtained with 1SDD and ligand 1.
    Keywords: Haemophilia A; Factor VIII; X-linked disease; drug designing.

  • De novo Drug Design, Pharmacophore Search and Molecular Docking for Inhibitors to treat TB and HIV co-infection
    by Satheeshkumar Sellamuthu, Ashok Kumar, Sushil Singh 
    Abstract: Novel molecules were designed as possible inhibitors of ATP synthase through de novo drug design, but were not drug-like molecules. Hence, ZINC database was searched for drug-like molecules from the common pharmacophore of the designed molecules. A total of 472 hits were obtained, among them, ZINC39552534, ZINC39371747, and ZINC38959526 produced better docking than the standard drug Bedaquiline. The vulnerability of TB and HIV co-infection has necessitated the search for inhibitors effective against both the diseases. Hence, the hits obtained were further screened for possible interaction with HIV reverse transcriptase. ZINC63941671, ZINC05858010, and ZINC05857787 were found better over the standard drug Rilpivirine, but their interaction was least against ATP synthase. Further, ZINC38959526 (lead against ATP synthase) and ZINC05858010 (lead against reverse transcriptase) share some common chemical features and based on this, new hybrid molecules were designed to inhibit both the targets. The possibility of hERG toxicity was also checked to eliminate unwanted cardiotoxicity.
    Keywords: ATP synthase inhibitors; De novo drug design; HIV; hERG toxicity; Molecular docking; Pharmacophore search; Reverse transcriptase; Tuberculosis; ZINC database.

  • Protein Interaction Network (PIN) analysis of TGF-β signaling pathway enabled EMT process to anticipate the anticancer activity of curcumin
    by Shivananda Kandagalla, Sharath B S, Bharath B R, Manjunatha H
    Abstract: TGF-β signaling is a key mediator of the EMT process, and its up-regulation is identified as a hallmark of metastasis. Since the TGF-β signaling pathway is known as a key therapeutic target in the treatment of EMT-enabled cancer, this study aims to identify key EMT genes using gene annotation tools and a protein interaction network (PIN) to analyze the regulatory dynamics of the interactome. In addition, the potency of curcumin against TGF-β signaling was evaluated by a network pharmacology approach. As a result, fifteen genes were identified as key regulators of the TGF-β signaling pathway and seven were shortlisted as leading curcumin targets. Together, both approaches supported the role of the targets. Curcumin was therefore subjected to molecular docking with the targets using AutoDock Vina, in which it showed significant binding energy with the targets EP300 and JUN (-7.1 and -6.4 kcal/mol, respectively), indicating potential anticancer properties.
    Keywords: EMT; TGF-β; Cancer; PIN; EP300; Molecular docking.

  • Improving the nerve regeneration ability by inhibiting the orchestral activity of the myelin associated repair inhibitors: An In Silico Approach
    by Sumaira Kanwal, Shazia Perveen
    Abstract: Spinal cord injury (SCI) causes severe neurological changes that significantly disrupt the physical, emotional and economic stability of affected individuals. Unfortunately, the repair capacity of the central nervous system (CNS) is very limited because of reduced intrinsic growth capacity and a non-permissive environment for axonal elongation. After injury, axonal regeneration in the adult CNS is inhibited by myelin-derived growth-suppressing proteins. In contrast, the regeneration capability of axons in the peripheral nervous system is much better. The effects of these axonal growth inhibitory proteins are mediated via activation of Rho, a small GTP-binding protein. Reticulon 4, myelin-associated glycoprotein and oligodendrocyte-myelin glycoprotein are the most influential axonal regeneration inhibitors. In the present study, a hybrid approach of comparative modeling and molecular docking followed by inhibitor identification and structure modeling was employed. Docking analysis showed that two important, widely used drugs have the potential to block the Rho-Rock pathways. Here, we report inhibitors which showed maximum binding affinity for the three most important axonal regeneration inhibitors. These two compounds act at three stages and can block the activity of the inhibitors of axon regeneration. This three-step approach could be used to counter axonal neuropathies, especially in CMT disease. However, further studies are required to find the applications of these drugs.
    Keywords: Axonopathy; CMT2; NOGO; Rho-Rock pathways; Nonsteroidal anti-inflammatory drugs; Neurological disorder; Spinal Cord Injury; Multiple Sclerosis.

  • Deep Convolutional Neural Network for Laser Forward Scattering Image Classification in Microbial Source Tracking
    by Bin Chen
    Abstract: Colony-based laser scatter imaging for microbial source tracking relies heavily on the power of optical scattering image classification. While carefully handcrafted feature extraction achieves excellent classification results for colonies within a certain optimal size range, the classification accuracy drops quickly for smaller or larger colonies outside that range. In this study, a deep convolutional neural network was implemented for laser scattering image feature extraction and classification. The results show that the deep learning classification method clearly outperforms the traditional clustering methods, with high accuracy and consistency for host species across a wide range of colony sizes. It also provides comparable accuracy for colonies of the optimal sizes.
    Keywords: deep learning; convolutional neural network; microbial source tracking; laser imaging.

  • Computational prediction of binding of monocrotophos and its analogues on Human acetylcholine esterase, oxyhaemoglobin and IgE antibody
    by Nathiya Soundararajan, Durga Mohan, Devasena Thiyagarajan
    Abstract: In the present study, a computational approach was employed to study the interactions of human acetylcholine esterase (AChE), human oxyhaemoglobin and the human high-affinity IgE receptor with organophosphate pesticides; the comparative binding affinity, interacting protein residues, H-bond distances and fitness scores were evaluated using GOLD software. Monocrotophos and its analogs bind to AChE with the highest fitness scores. The analog RPR-II binds to the receptor with the highest fitness score (42.17), compared with RPR-V (fitness score: 40.62) and monocrotophos (fitness score: 35.25). Monocrotophos, RPR-II and RPR-V interact with oxyhaemoglobin with fitness scores of about 17.68, 20.16 and 24.62, respectively. Monocrotophos, RPR-II and RPR-V interact with the human high-affinity IgE receptor with fitness scores of about 18.29, 19.05 and 22.57, respectively. These results indicate that the RPR series is more toxic than monocrotophos; hence there is a need for complete evaluation of the toxicological effects of new pesticides.
    Keywords: Monocrotophos; RPR series; Acetylcholine esterase; Oxyhaemoglobin; IgE receptor; Toxicology.

  • The unique QA domain of Runx2 causes conformational change in the Runt DNA binding domain which may result in alteration in its function
    by Arpita Devi
    Abstract: Runt-related transcription factors (RUNX) are a family of proteins expressed by RUNX genes. In mammals, there are three members in this family: RUNX1, RUNX2 and RUNX3. There is high sequence similarity among the three members. However, RUNX2 has a QA domain in its N-terminal region. The structural aspects of this domain have not been elucidated until now. Here, we model the structures of RUNX1, RUNX2 and RUNX2 without the QA domain (RUNX2Δqa) from the N-terminus to the DNA binding domain. We found a significant difference between the structures of RUNX2 and RUNX2Δqa. The structure of RUNX2Δqa resembles that of RUNX1. Also, RUNX2Δqa seems to bind to the consensus DNA sequence of RUNX1 with higher affinity than RUNX2 does. The presence of the QA domain also decreases the affinity of RUNX2 towards CBFbeta. Thus, we find that the QA domain structurally and functionally diverts RUNX2 from RUNX1.
    Keywords: Runx; Docking; Molecular dynamics simulation; QA domain.

  • Exploration of Cyclooxygenase-1 Binding modes of some Chiral Anti-inflammatory Drugs using Molecular Docking and Dynamic Simulations
    by Meriem Meyar, Samira Feddal, Zohra Bouakouk, Safia Kellou-Tairi 
    Abstract: The profens represent an important class of chiral anti-inflammatory drugs. They are often marketed as racemic mixtures, but one of their enantiomers R or S can be inactive or toxic. With the aim of evaluating the anti-inflammatory activity of each enantiomer, it would be useful to first theoretically predict the enantiomer responsible for this activity. For that, three well known profens: ibuprofen, flurbiprofen, naproxen and some of their derivatives have been selected from the literature and were studied through docking and molecular dynamic (MD) simulations. Analysis of the recognition modes, through interactions with relevant residues of the cyclooxygenase-1(COX-1), can predict and explain which enantiomer is the most active. MD study highlights that water molecules play an important role in ligand-receptor interactions. Also, our combined study showed the preference of the profen's S-enantiomer towards the COX-1 active site in contrast to R-enantiomer.
    Keywords: COX-1; Profens; Chiral NSAIDs; Molecular docking; MD simulations; Binding Modes.

  • A Novel Approach for Identification of possible GSK-3 inhibitors using computational virtual screening analysis of Drugs
    Abstract: GSK-3 has a prominent role in glucose uptake and was investigated using more specific, ATP-competitive GSK-3 inhibitors. This multifunctional kinase, apart from its ability to phosphorylate glycogen synthase and regulate glucose metabolism, was subsequently found to be a critical component in numerous cellular functions, including regulation of different cell signaling, cell division, differentiation, proliferation and growth as well as apoptosis. In this work, we report molecular docking analysis of 2035 approved drugs from the DrugBank database, based on the hypothesis that certain medications would decrease the risk of diabetes, and evaluate the characteristic properties of the drugs and their potential to bind to the type-2 diabetes protein target GSK-3β. The crucial amino acids responsible for stable interaction with ligands were found to be Lys85, Asp133 and Val135. Molecular docking analysis revealed several new classes of drugs reported to exhibit inhibitory properties against GSK-3β. Apart from the crucial amino acid interactions, several other amino acids, such as Asn64, Arg141, Cys19 and Asp200, were found to interact with the drug compounds. Of the 13 best drugs resulting from the analysis, the top three (Venetoclax, Cobicistat and Atorvastatin) were selected based on consensus scoring using six scoring schemes: the MolDock score of Molegro, mcule, Pose&Rank, MTiAutoDock, DockThor and DSX.
    Keywords: virtual screening; molecular docking; DrugBank; type-2 diabetes; GSK-3β.

Special Issue on: BIBM 2017 Integrative Data Analysis in System Biology

  • Distance Based Knowledge Retrieval through Rule Mining for Complex Biomarker Recognition from Tri-Omics Profiles
    by Saurav Mallik, Zhongming Zhao
    Abstract: Biomarker discovery from complex biomedical data has become an important topic to unveil the significant new knowledge and disease signals for disease prevention, diagnosis and treatment during the past two decades. In general, most of the earlier methods for complex marker discovery have been proposed on the basis of a single genomic profile, and most of them utilize a single minimum support, single minimum confidence, or single minimum lift cutoffs. To overcome these general shortcomings, in this manuscript, we developed a framework for identifying complex markers using the shortest distance based rule mining technique from the tri-omics profiles (namely, gene expression, DNA methylation and protein-protein interaction). We applied our method to a multi-omics dataset for high-grade soft tissue sarcomas. The novel markers of the sarcoma that we identified were {GRB2-, STAT3-} (i.e., both GRB2 and STAT3 as down-regulated and hyper-methylated, - denotes decreased gene activity, while + denotes increased activity), {STAT3+, TP53-, MAPK3+} (i.e., both STAT3 and MAPK3 as up-regulated and hypo-methylated & TP53 as down-regulated and hyper-methylated) and {STAT3+, FYN+, MAPK3+} (i.e., all the STAT3, FYN and MAPK3 as up-regulated and hypo-methylated). In our comparison of our rule mining method with the existing rule mining approaches, we showed the superiority and efficiency of our method versus others, as our method generates fewer rules and lower mean of the shortest distance than the existing methods. In addition, we evaluated the markers by conducting KEGG pathway analyses as well as extensive literature search. In conclusion, our method is useful to extract complex markers from tri-omics profiles of the data for the complex disease or cellular conditions.
    Keywords: Tri-omics data; Multiple Minimum Supports/Confidences/Lifts; Empirical Bayes Test; Weighted Shortest Distance; Complex marker.

  • Identification of temporal network changes in short-course gene expression from C. elegans reveals structural volatility
    by Kathryn Cooper, Wail Hassan, Hesham Ali 
    Abstract: Many Bioinformatics algorithms attempt to extract relevant biological information from datasets obtained at specific data points. However, it is critical to identify changing genes in temporal data so that studies can focus on the dynamics of gene expression. While networks continue to play a significant role in characterizing biological relationships, most biomedical network modeling studies focus on static network-based analysis. In this study, we use a temporal, network-based approach to identify and rank genes that exhibit variation in short-course gene expression. We use a C. elegans gene correlation network obtained from mRNA expression to illustrate the value of the proposed approach, and compare the results of this method to results obtained from traditional differential gene expression analysis. We show that temporal network analysis identifies genes that are inherently different from differentially expressed genes, raising new questions about structural meaning in expression networks and how changes in expression are observed.
    Keywords: temporal network structural change; short-course gene expression; structural volatility; biological network modeling; differential gene expression.

  • Managing data provenance for bioinformatics workflows using AProvBio
    by Rodrigo Almeida, Waldeyr Silva, Klayton Castro, Maria Emília Machado Telles Walter, Aletéia Patricia Favacho De Araújo, Sergio Lifschitz, Maristela Holanda 
    Abstract: Scientific experiments in bioinformatics are often executed as computational workflows. Data provenance involves documenting the history, and the paths of the input data, from the beginning to the end of an experiment. AProvBio is an architecture that enables the capture and storage of data provenance for bioinformatics workflows using the PROV-DM standard model. AProvBio works with three types of data provenance: prospect, retrospect, and the user-defined type. Given how graphs conveniently express PROV-DM, we have designed and implemented a simulator for storing the data provenance in a graph database system. This paper presents details and implementation aspects of our architecture, and an evaluation of AProvBio through the carrying out of two real case scenarios.
    Keywords: bioinformatics; scientific workflows; data provenance; PROV-DM; graph database.

  • Simulating genetically heterozygous genomes in the tumor tissue according to its clonal evolution history
    by Yanshuo Chu, Mingxiang Teng, Yadong Wang
    Abstract: Tumors contain multiple, genetically diverse subclonal populations of cells that have evolved from a single progenitor population through successive waves of expansion and selection. Next-generation sequencing (NGS) and third-generation sequencing (TGS) have recently allowed us to develop algorithms to quantitatively dissect the extent of heterogeneity within a tumor, resolve cancer evolution history and identify somatic variations and aneuploidy events with subclonal frequency. However, existing tumor NGS data has no ground truth annotation sufficient to validate all these NGS-based tumor analysis algorithms. To benchmark these algorithms, there is a need for a powerful tumor genome simulation tool that can simulate all the distinct subclonal genomes with diverse aneuploidy events and somatic variations according to a given tumor evolution history. We provide a simulation package, Pysubsim-tree, which can simulate tumor genomes according to their evolution history as defined by the somatic variations and aneuploidy events. Pysubsim-tree is free, open source, available at:
    Keywords: Somatic variations; Cancer evolution history; Tumor heterogeneity; Tumor genome simulation.

  • Networks Regulated by Ginger towards Stomach and Small Intestine for Its Warming Interior Function
    by Guang Zheng
    Abstract: Ginger is widely used both as a cooking spice in east/south Asia and as a traditional Chinese medicine (TCM) for its warming interior function, a TCM concept mainly referring to warming up the stomach and small intestine. This edible therapeutic function has been identified over the long history of TCM clinical and regimen practices. However, the underlying mechanism of warming interior is still obscure at the protein regulating network level. In this study, for the stomach and small intestine, 6-gingerol and 6-shogaol, ginger's two bio-active compounds, are selected to initialize the underlying protein regulating networks. The first step is to identify the proteins targeted/regulated by ginger. These proteins were extracted from the PubMed literature and compound-protein databases. Starting with these targeted proteins, functional protein-protein interactions (FPPI) were selected to form the underlying regulating networks towards proteins expressed in the stomach and small intestine. Further enrichment analysis of FPPI participating proteins (e.g. PPARG, GSK3B, GPD1, SCD, GSK3A, and IRS2) highlights five key metabolic processes on ATP, glycogen, coenzyme, glycerolipid and fatty acid. As a result, this FPPI network can be validated by the PubMed literature, together with the online bioinformatics tools DAVID and KEGG. In brief, ginger's warming interior function is elaborated via the specified FPPI network. When activated by 6-gingerol and 6-shogaol, these five key metabolic processes can release more energy/heat to carry out the warming interior function that TCM attributes to ginger.
    Keywords: ginger; biological process; regulation network; warming interior; metabolic process.

  • A comparative study of multiclass feature selection on RNAseq and microarray data
    by Silu Zhang, Junqing Wang, Keli Xu, Megan York, Yin-yuan Mo, Yixin Chen, Yunyun Zhou
    Abstract: Gene expression profiles are widely used for identifying phenotype-specific biomarkers in clinical cancer research. By examining important gene features that are specifically up- or down-expressed in different phenotypes, clinicians can classify patients into different treatment groups for precision medicine. Microarray and RNAseq are the two leading technologies for measuring gene expression. However, due to the heterogeneity of the two experimental platforms, the gene signatures selected from them are different, and few studies have comparatively investigated the classification performance of gene features selected from the two platforms. In this project, using human breast cancer expression data as an example, we systematically compared the cancer subtype classification accuracies of the gene signatures selected by four popular multiclass feature selection algorithms and discussed the strengths and weaknesses of the selected genes across different experimental platforms and cohorts. Our results showed that classification with the selected genes performs best within the same platform, even across different cohorts. Our results suggest that merging datasets that belong to the same platform will increase the statistical power and improve the prediction accuracy of the selected genes for multiclass classification analysis.
    Keywords: Systems biology; feature selection; breast cancer; cancer subtypes; machine learning; functional analysis; integration analysis; pathway.

Special Issue on: ICIBM 2018 Intelligent Biology and Medicine

  • Drug-Drug Interaction Prediction based on Co-Medication Patterns and Graph Matching
    by Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning 
    Abstract: High-order Drug-Drug Interactions (DDIs) and associated Adverse Drug Reactions (ADRs) are common, particularly for elderly people, and therefore represent a significant public health problem. Currently, high-order DDI detection primarily relies on the spontaneous reporting of ADR events. However, proactive prediction of unknown DDIs and their ADRs is of indispensable benefit to protective health care. In this manuscript, the problem of predicting whether a drug combination of arbitrary order is likely to induce adverse drug reactions is considered. The prediction problem becomes highly non-trivial when drug combinations of arbitrary order have to be accommodated by the prospective computational methods. To solve this problem, novel kernels over drug combinations of arbitrary order are developed within support vector machines for the prediction. Graph matching methods are used in the novel kernels to measure the similarities among drug combinations, in which drug co-medication patterns are leveraged to measure single drug similarities. The experimental results on a real-world dataset demonstrated that the new kernels achieve an area under the curve (AUC) value of 0.912 for the prediction problem. The new methods, with single drug similarities based on drug co-medication, can accurately predict whether a drug combination is likely to induce adverse drug reactions of interest.
    Keywords: drug-drug interaction prediction; drug combination similarity; co-medication; graph matching; arbitrary order; adverse drug reaction; myopathy; single drug similarity; support vector machines; binary classification problem.

  • Pessimistic Optimization For Modeling Microbial Communities With Uncertainty
    by Meltem Apaydin, Liang Xu, Bo Zeng, Xiaoning Qian 
    Abstract: It is important to understand the complicated interactions within microbial communities, which play critical roles in ecological systems, human health and disease. Optimization-based mathematical models provide ways to analyze and obtain predictions on microbial communities. However, existing knowledge and experiments about different microbial communities carry inherent model and data uncertainties, so the imposed models may not exactly reflect reality in nature. Here, addressing these challenges and aiming for a flexible framework to model microbial communities with uncertainty, we introduce P-OptCom, an extension of the existing method OptCom, based on ideas from the pessimistic bilevel optimization literature. This framework relies on coordinated decision making between a single upper-level (community-level) decision maker and multiple lower-level decision makers (multiple microorganisms or guilds) to support robust solutions that better approximate microbial community steady states even when the individual microorganisms' behavior deviates from the optimum in terms of their cellular fitness criteria. We formulate the problem by considering suboptimal behavior of the individual members and relaxing the constraints denoting the interactions within communities, obtaining a model flexible enough to deal with potential uncertainties. Our study demonstrates that, without experimental knowledge in advance, we are able to analyze the tradeoffs among the members of microbial communities and closely approximate the actual experimental measurements.
    Keywords: Microbial communities; Pessimistic bilevel optimization; Stoichiometric-based genome-scale metabolic modeling.

  • TopQA: A Topological Representation for Single-Model Protein Quality Assessment with Machine Learning
    by John Smith, Matthew Conover, Natalie Stephenson, Jesse Eickholt, Dong Si, Miao Sun, Renzhi Cao 
    Abstract: Correctly predicting the complex three-dimensional structure of a protein from its sequence would allow for a superior understanding of the function of specific proteins. Thus, advancements could be made in drug discovery, nanotechnology, and many other biological fields. We propose a novel method aimed at tackling a crucial step in the protein prediction problem: assessing the quality of generated predictions. Previous research has focused on qualities of proteins such as the distance between amino acids or energy functions. Our method, to the best of our knowledge, is the first to analyze the topology of the predicted structure. We confirmed our representation with a widely used visualization tool, Chimera, and found that it provided accurate information regarding the location of the protein's backbone. Using this information, we implemented a novel algorithm based on a Convolutional Neural Network (CNN) to predict the GDT_TS score (a metric for assessing the quality of a model) for given protein models. Our method has shown promising results, achieving an overall correlation of 0.41 on the CASP12 testing dataset. Future work will aim to implement additional features into our representation. The software is freely available at GitHub:
    Keywords: Convolutional Neural Network; protein single-model quality assessment; topological representation.

  • A Hidden Markov Model-based approach to reconstructing double minute chromosome amplicons
    by Ruslan Mardugalliamov, Kamal Al Nasr, Matthew Hayes 
    Abstract: Double minute chromosomes (DMs) are circular fragments of extrachromosomal DNA. They are a mechanism for extreme gene amplification in the cells of some malignant tumors. Their existence strongly correlates with malignant tumor cell behavior and drug resistance. Locating DMs is important for informing precision therapy in cancer treatment. Furthermore, accurate detection of double minutes requires precise reconstruction of their amplicons, which are the highly amplified, gene-carrying contiguous segments that adjoin to form DMs. This work presents AmpliconFinder, a Hidden Markov Model-based approach to detecting DM amplicons. To assess its efficacy, AmpliconFinder was used to augment an earlier framework for DM detection (DMFinder), improving both its robustness to noisy sequence data and its sensitivity in detecting DMs. Experiments on simulated genomic data have shown that augmenting DMFinder with AmpliconFinder significantly increased the sensitivity of DMFinder on these data. Moreover, DMFinder with AmpliconFinder found all previously reported DMs in three pediatric medulloblastoma datasets, whereas the original DMFinder framework found none.
    Keywords: next generation sequencing; double minute chromosome; double minute; structural variation; amplicon; tumor genome reconstruction.

  • High scoring segment selection for pairwise whole genome sequence alignment with the maximum scoring subsequence and GPUs
    by Abdulrhman Aljouie, Ling Zhong, Usman Roshan 
    Abstract: Whole genome alignment programs use exact string matching with hash tables to quickly identify high scoring fragments between a query and a target sequence, around which a full alignment is then built. In a recent large-scale comparison of alignment programs called Alignathon, it was discovered that while evolutionarily similar genomes were easy to align, divergent genomes still posed a challenge to existing methods. As a first step to fill this gap, we explore the use of more exact methods to identify high scoring fragments, which we then pass on to a standard pipeline. We identify such segments between two whole genome sequences with the maximum scoring subsequence instead of hash tables. This is computationally extremely expensive, so we employ the parallelism of a Graphics Processing Unit to speed it up. We split the query genome into several fragments and determine each fragment's best match to the target with a previously published GPU algorithm for aligning short reads to a genome sequence. We then pass such high scoring fragments on to the LASTZ program, which extends each fragment to obtain a more complete alignment. Upon evaluation on simulated data, where the true alignment is known, we see that this method gives an average of at least 20% higher accuracy than the alignment given by LASTZ, at the expense of a few hours of additional runtime. We make our source code freely available.
    Keywords: genome alignment; anchor selection; LASTZ; GPU.

  • Brain-wide structural connectivity alterations under the control of Alzheimer risk genes
    by Jingwen Yan, Vinesh Raja V, Zhi Huang, Enrico Amico, Kwangsik Nho, Shiaofen Fang, Olaf Sporns, Yu-chien Wu, Andrew Saykin, Joaquin Goni, Li Shen 
    Abstract: Background: Alzheimer's disease is the most common form of brain dementia, characterized by gradual loss of memory followed by further deterioration of other cognitive function. Large-scale genome-wide association studies have identified and validated more than 20 AD risk genes. However, how these genes are related to the brain-wide breakdown of structural connectivity in AD patients remains unknown. Methods: We used the genotype and diffusion tensor imaging (DTI) data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. After constructing the brain network for each subject, we extracted three types of link measures: fiber anisotropy, fiber length and density. We then performed a targeted genetic association analysis of brain-wide connectivity measures using general linear regression models. Age at scan and gender were included in the regression model as covariates. For fair comparison of the genetic effect on different measures, fiber anisotropy, fiber length and density were all normalized to mean 0 and standard deviation 1. We aim to discover the abnormal brain-wide network alterations under the control of 34 AD risk SNPs identified in previous large-scale genome-wide association studies. Results: After enforcing the stringent Bonferroni correction, rs10498633 in SLC24A4 was found to be significantly associated with anisotropy, total number and length of fibers, including some connecting brain hemispheres. rs429358 in the top AD risk gene APOE shows nominal significance of association with the density of fibers between Subcortical and Cerebellum (p=2.71e-6).
    Keywords: brain connectivity; imaging genetics association; Alzheimer's disease.
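
    The normalization and Bonferroni steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the sample values, and the number of tests are hypothetical, and only the standard definitions (z-score scaling, threshold alpha divided by the number of tests) are assumed.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot normalize a constant measure")
    return [(v - m) / s for v in values]


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if p_value is significant after Bonferroni correction."""
    return p_value <= bonferroni_threshold(alpha, n_tests)


# Hypothetical fiber-length values for three subjects (illustration only).
normalized = zscore([2.0, 4.0, 6.0])

# Hypothetical test count; the abstract does not state how many tests were run.
N_TESTS = 100_000
nominal_only = not passes_bonferroni(2.71e-6, alpha=0.05, n_tests=N_TESTS)
```

    Under this assumed test count the per-test threshold is 0.05 / 100000, so a p-value of 2.71e-6 would fall short of it, which is one way a result can be nominally significant without surviving correction.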