Forthcoming articles


International Journal of Computational Biology and Drug Design


These articles have been peer-reviewed and accepted for publication in IJCBDD, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.


Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.




International Journal of Computational Biology and Drug Design (31 papers in press)


Regular Issues


  • Data Acquisition and Electrical Instrumentation Engineering Modelling for Intelligent Learning and Recognition
    by Jun Qin, Yuhao Jiang 

  • Development of interactive computer learning program for genetics and molecular biology applications
    by Xiaoli Yang, Bin Chen, Yifan Cai, Charles Tseng 

  • Human Caveolin-1 a potent inhibitor for prostate cancer therapy: a computational approach
    by Uzma Khanam, Balwant Kishan Malik, Puniti Mathur, Bhawna Rathi 
    Abstract: Caveolin-1 (Cav-1) is a 22 kDa caveolae protein that acts as a scaffold within caveolar membranes. It interacts with alpha subunits of G-protein and thereby regulates their activity. Earlier studies reported elevated levels of caveolin-1 in the serum of prostate cancer patients. Secreted Cav-1 promotes angiogenesis, cell proliferation and anti-apoptotic activities in prostate cancer patients. Cav-1 upregulation is mainly related to prostate cancer metastasis. With these facts in view, the present study was designed to explore Cav-1 as a target for prostate cancer therapy using a computational approach. Molecular docking, structure-based molecular modelling and molecular dynamics simulations were performed to investigate Cav-1 inhibitors. A predictive model was generated and validated to establish a stable structure. The ZINC database of biogenic compounds was used for induced fit docking (IFD) and high throughput virtual screening. The H-bond interactions of the compounds with active site residues of Cav-1 were estimated by IFD and 100 ns molecular dynamics simulations. The reported compounds showed significant binding and thus can be considered as potent therapeutic inhibitors of Cav-1. This study provides a valuable insight into biochemical interactions of Cav-1 for therapeutic applications and warrants experimental validation of the predicted active compounds.
    Keywords: Molecular dynamics simulation; virtual screening; molecular docking; prostate cancer; caveolin-1; induced fit docking; protein-protein interaction network.

  • An in silico approach to design a potential drug for Haemophilia A
    by Srishti Munjal, Gaurav Jaisawal, Navodit Goel, Udai Pratap Singh, Ajay Vishwakrma, Abhinav Srivastava 
    Abstract: Haemophilia A has been known as a disease since the late 20th century, but to date no cure has been developed for it. Treatments that temporarily relieve bleeding episodes include new factor replacement therapies with longer half-lives, which reduce the frequency of blood transfusions. There is a need to devise a new drug for the disease. In silico drug design is a powerful tool for designing a candidate drug molecule in comparatively less time. In this study, a new drug molecule was designed using bioinformatics tools. The causative gene was identified as the X-linked F8 gene and the corresponding protein as coagulation factor VIII. Material and Methods: Target proteins were identified from protein databases and their structures were examined. Cavities in the proteins were determined using SPDBV (Swiss PDB Viewer). Ligands and their isomers, following Lipinski's rule of five, were prepared through Molinspiration. Docking between the ligands and target proteins was performed using Molegro Virtual Docker. Results: Thirteen proteins were selected and twelve ligands were prepared. Docking studies were performed and two criteria were compared: MolDock score and hydrogen bond score. The most favourable values, -836.722 for the MolDock score and -55.02 for the H-bond score, were obtained with 1SDD and ligand 1.
    Keywords: Haemophilia A; Factor VIII; X-linked disease; drug designing.

  • De novo Drug Design, Pharmacophore Search and Molecular Docking for Inhibitors to treat TB and HIV co-infection
    by Satheeshkumar Sellamuthu, Ashok Kumar, Sushil Singh 
    Abstract: Novel molecules were designed as possible inhibitors of ATP synthase through de novo drug design, but they were not drug-like molecules. Hence, the ZINC database was searched for drug-like molecules using the common pharmacophore of the designed molecules. A total of 472 hits were obtained, among which ZINC39552534, ZINC39371747, and ZINC38959526 docked better than the standard drug Bedaquiline. The vulnerability to TB and HIV co-infection has necessitated the search for inhibitors effective against both diseases. Hence, the hits obtained were further screened for possible interaction with HIV reverse transcriptase. ZINC63941671, ZINC05858010, and ZINC05857787 were found to be better than the standard drug Rilpivirine, but showed the weakest interaction with ATP synthase. Further, ZINC38959526 (lead against ATP synthase) and ZINC05858010 (lead against reverse transcriptase) share some common chemical features, and based on these, new hybrid molecules were designed to inhibit both targets. The possibility of hERG toxicity was also checked to eliminate unwanted cardiotoxicity.
    Keywords: ATP synthase inhibitors; De novo drug design; HIV; hERG toxicity; Molecular docking; Pharmacophore search; Reverse transcriptase; Tuberculosis; ZINC database.

  • Improving the nerve regeneration ability by inhibiting the orchestral activity of the myelin associated repair inhibitors: An In Silico Approach
    by Sumaira Kanwal, Shazia Perveen 
    Abstract: Spinal cord injury (SCI) causes severe neurological modifications that significantly disrupt the physical, emotional and economic stability of affected individuals. Unfortunately, the repair ability of the central nervous system is very restricted because of reduced intrinsic growth capacity and a non-permissive environment for axonal elongation. After injury, axonal regeneration in the adult central nervous system (CNS) is inhibited by myelin-derived growth-suppressing proteins. In contrast, the regeneration capability of axons in the peripheral nervous system is much better. These axonal growth inhibitory proteins act via activation of Rho, a small GTP-binding protein. Reticulon 4, myelin associated glycoprotein and oligodendrocyte-myelin glycoprotein are the most influential axonal regeneration inhibitors. In the present study, a hybrid approach of comparative modeling and molecular docking followed by inhibitor identification and structure modeling was employed. Docking analysis showed that two important, widely used drugs have the potential to block the Rho-Rock pathways. Here, we report inhibitors which showed maximum binding affinity for the three most important axonal regeneration inhibitors. These two compounds act at three stages and can block the activity of the inhibitors of axon regeneration. A three-step approach could be used to counter axonal neuropathies, especially in CMT disease. However, further studies are required to find the applications of these drugs.
    Keywords: Axonopathy; CMT2; NOGO; Rho-Rock pathways; Nonsteroidal anti-inflammatory drugs; Neurological disorder; Spinal Cord Injury; Multiple Sclerosis.

  • Deep Convolutional Neural Network for Laser Forward Scattering Image Classification in Microbial Source Tracking
    by Bin Chen 
    Abstract: Colony-based laser scatter imaging for microbial source tracking relies heavily on the power of optical scattering image classification. While carefully handcrafted feature extraction achieves excellent classification results for colonies within a certain size range, the classification accuracy drops quickly for smaller or larger colonies outside that range. In this study, a deep convolutional neural network was implemented for laser scattering image feature extraction and classification. The results show that the deep learning classification method clearly outperforms the traditional clustering methods, with high accuracy and consistency for host species with a wide range of colony sizes. It also provides comparable accuracy for colonies of the optimal sizes.
    Keywords: deep learning; convolutional neural network; microbial source tracking; laser imaging.

  • Computational prediction of binding of monocrotophos and its analogues on Human acetylcholine esterase, oxyhaemoglobin and IgE antibody
    by Nathiya Soundararajan, Durga Mohan, Devasena Thiyagarajan 
    Abstract: In the present study, a computational approach has been employed to study the interactions of human acetylcholine esterase (AchE), human oxyhaemoglobin and human high-affinity IgE receptor with organophosphate pesticides. The comparative binding affinity, interacting residues of the proteins, H-bond distances and fitness scores have been evaluated using GOLD software. Monocrotophos and its analogs bind to AchE with the highest fitness scores. The analog RPR-II binds to the receptor with the highest fitness score (42.17), compared with RPR-V (fitness score: 40.62) and monocrotophos (fitness score: 35.25). Monocrotophos, RPR-II and RPR-V interact with oxyhaemoglobin with fitness scores of about 17.68, 20.16 and 24.62 respectively. Monocrotophos, RPR-II and RPR-V interact with human high-affinity IgE receptor with fitness scores of about 18.29, 19.05 and 22.57 respectively. The above results indicate that the RPR series is more toxic than monocrotophos; hence there is a need for complete evaluation of the toxicological effects of new pesticides.
    Keywords: Monocrotophos; RPR series; Acetylcholine esterase; Oxyhaemoglobin; IgE receptor; Toxicology.

  • The unique QA domain of Runx2 causes conformational change in the Runt DNA binding domain which may result in alteration in its function
    by Arpita Devi 
    Abstract: Runt-related transcription factors (RUNX) are a family of proteins expressed by RUNX genes. In mammals, there are three members in this family: RUNX1, RUNX2 and RUNX3. There is high sequence similarity among the three members. However, RUNX2 has a QA domain in its N-terminal region. The structural aspect of this domain has not been elucidated until now. Here, we model the structures of RUNX1, RUNX2 and RUNX2 without the QA domain (RUNX2Δqa) from the N-terminal to the DNA binding domain. It has been found that there is a significant difference between the structures of RUNX2 and RUNX2Δqa. The structure of RUNX2Δqa resembles that of RUNX1. Also, RUNX2Δqa seems to bind to the consensus DNA sequence of RUNX1 with higher affinity than RUNX2 does. The presence of the QA domain also decreases the affinity of RUNX2 towards CBFbeta. Thus, we find that the QA domain structurally and functionally diverts RUNX2 from RUNX1.
    Keywords: Runx; Docking; Molecular dynamics simulation; QA domain.

  • Exploration of Cyclooxygenase-1 Binding modes of some Chiral Anti-inflammatory Drugs using Molecular Docking and Dynamic Simulations
    by Meriem Meyar, Samira Feddal, Zohra Bouakouk, Safia Kellou-Tairi 
    Abstract: The profens represent an important class of chiral anti-inflammatory drugs. They are often marketed as racemic mixtures, but one of their enantiomers, R or S, can be inactive or toxic. With the aim of evaluating the anti-inflammatory activity of each enantiomer, it would be useful to first theoretically predict the enantiomer responsible for this activity. To that end, three well-known profens (ibuprofen, flurbiprofen and naproxen) and some of their derivatives were selected from the literature and studied through docking and molecular dynamics (MD) simulations. Analysis of the recognition modes, through interactions with relevant residues of cyclooxygenase-1 (COX-1), can predict and explain which enantiomer is the most active. The MD study highlights that water molecules play an important role in ligand-receptor interactions. Also, our combined study showed the preference of the profens' S-enantiomer for the COX-1 active site, in contrast to the R-enantiomer.
    Keywords: COX-1; Profens; Chiral NSAIDs; Molecular docking; MD simulations; Binding Modes.

  • A Novel Approach for Identification of possible GSK-3 inhibitors using computational virtual screening analysis of Drugs
    Abstract: GSK-3 has a prominent role in glucose uptake and was investigated using more specific, ATP-competitive GSK-3 inhibitors. This multifunctional kinase, apart from its ability to phosphorylate glycogen synthase and regulate glucose metabolism, was subsequently found to be a critical component in numerous cellular functions, including regulation of different cell signaling pathways, cell division, differentiation, proliferation and growth as well as apoptosis. In this work, we report molecular docking analysis of 2035 approved drugs from the DrugBank database, based on the hypothesis that certain medications would decrease the risk of diabetes, and evaluate the characteristic properties of the drugs and their potential to bind to the type-2 diabetes protein target, GSK-3β. The crucial amino acids responsible for stable interaction with ligands were found to be Lys85, Asp133 and Val135. Molecular docking analysis revealed several new classes of drugs predicted to exhibit inhibitory properties against GSK-3β. Apart from the crucial amino acid interactions, several other amino acids, such as Asn64, Arg141, Cys19 and Asp200, were found to interact with the drug compounds. Of the 13 best drugs resulting from the analysis, the top three (Venetoclax, Cobicistat and Atorvastatin) were selected based on consensus scoring using six scoring schemes: MolDock score of Molegro, mcule, Pose&Rank, MTiAutoDock, DockThor and DSX.
    Keywords: virtual screening; molecular docking; DrugBank; type-2 diabetes; GSK-3β.

  • Asymmetric glycan recognition among alpha and beta monomers of Spatholobus parviflorus lectin: An in silico insight
    by Surya Sukumaran, Haridas M 
    Abstract: Protein-carbohydrate recognition, an important form of inter-cell communication, plays a promising role in several biological events. Extensive studies have already been done in this area using legume lectin molecules for drug targeting. The area is broad, and many challenges remain to be solved. In this study, an attempt was made to reveal the interaction homogeneity of various carbohydrate residues and to compare their binding modes towards the alpha and beta monomers of Spatholobus parviflorus lectin (SPL) as a model. An array of sugars, chosen for their structural and functional roles in information coding, was selected for virtual screening. Based on the glide score, 20 sugars were screened and extra precision docking exercises were carried out to explore the variability and stability of their binding affinity towards the SPL monomers. Among the studied sugars, raffinose exhibited the highest affinity towards the alpha and beta monomers, with glide scores of -11.43 and -10.65 kcal/mol respectively. When compared to alpha, the beta monomer showed higher glide score by favoring stable interactions. The low binding affinity of the alpha subunit is associated with the extra cleft seen in close proximity to the known sugar binding pocket of the alpha subunit, which accommodates structurally miniatured sugars. These differences between the alpha and beta monomers may be due to asymmetry in the pairing of the α and β subunits. This prediction, which describes the in silico binding of sugars with SPL along with the inconsistency in their binding to the monomeric units, may contribute towards more specific and precise drug targeting.
    Keywords: SPL; monomers; sugars.

  • Functional Module Extraction by Ensembling the Ensembles of Selective Module Detectors
    by Monica Jha, Pietro Guzzi, Pierangelo Veltri, Swarup Roy 
    Abstract: A group of functionally related genes constitutes a functional module taking part in similar biological activities. Such modules can be employed for interpretation of biological and cellular processes or their involvement in associated diseases. Detection of such modules from a co-expression network is a difficult task; different methods have been employed to date for detecting such modules, such as clustering, biclustering and network-based techniques. In this work, we discuss and compare selective module finding methods and their ensemble. We use RNA sequencing (RNA-Seq) data to evaluate the performance of a few network-based module finding techniques. We observe that the ensemble technique increases accuracy and stability.
    Keywords: Next generation Sequencing; RNA Seq; Ensemble; Functional Module; Gene Ontology; Pathway analysis.

  • The potential inhibitory role of teucrolivins against human Dipeptidyl peptidase 4 protein as a promising strategy for treatment of type 2 diabetes
    by Ateeq Al-Zahrani 
    Abstract: Inhibition of disease-related proteins by natural inhibitors has proved efficient and has become a promising step in drug discovery. With hundreds of advanced web servers and software packages, it is possible to predict potential drug targets in order to reduce laboratory cost and time. In the current study, computational simulations were performed to investigate the possible role of teucrolivins, isolated from the Teucrium oliverianum plant, as natural inhibitors against dipeptidyl peptidase 4 protein (DP4), which is related to type 2 diabetes. The docking results revealed that teucrolivins A, B, D and E showed higher binding affinities compared to the native inhibitor PF2. Teucrolivin D exhibited the highest interactions among teucrolivins, with the minimum binding energy of -144.16. Sitagliptin, vildagliptin and omarigliptin are antidiabetic drugs for inhibition of dipeptidyl peptidase 4 protein. These drugs were used as negative controls. They gave minimum binding energies of -120.19, -103.1 and -104.69 respectively, and showed a lower binding affinity compared to teucrolivin D. Evaluation of ADMET confirmed the capability of teucrolivin D as an effective inhibitor against DP4 and its promising potential as an antidiabetic drug. This study highlights the medical importance of teucrolivins and the possibility of using this class of inhibitors for the treatment of type 2 diabetes.
    Keywords: teucrolivins; Teucrium oliverianum; dipeptidyl peptidase 4; antidiabetic inhibitors; molecular docking.

    by Kalirajan Rajagopal, Pandiselvi A, Gowramma B 
    Abstract: 9-aminoacridines play an important role in the field of antitumor DNA-intercalating agents, due to their antiproliferative properties. Several anticancer agents based on 9-anilinoacridines, such as amsacrine and nitracrine, have been developed. To gain insight into intermolecular interactions, molecular docking studies were performed at the active site of HER2. Aim: The present study aims to identify potential ligands among isoxazole-substituted 9-aminoacridines as selective HER2 inhibitors (PDB ID: 3PP0) targeting breast cancer, using Schrödinger suite 2016-2, Maestro version 9.6. Molecular docking against HER2 was performed with the Glide module, in silico ADMET screening was performed with the QikProp module, and the free binding energy of the compounds was calculated with the Prime MM-GBSA module. The binding affinity of the designed molecules towards HER2 (PDB ID: 3PP0) was assessed on the basis of Glide score and interaction patterns. Many compounds showed strong hydrophobic and hydrogen bonding interactions with amino acid residues, which explain their potency to inhibit HER2 (3PP0). The isoxazole-substituted 9-aminoacridine derivatives 1a-1x have good binding affinity, with Glide scores in the range of -6.6 to -9.7, compared with the standards ledacrine (-6.3) and tamoxifen (-3.7). In the ADMET screening, almost all properties of the designed compounds were within the recommended values. The MM-GBSA binding results for the most potent inhibitor were stable and favourable. The results reveal that isoxazole-substituted 9-aminoacridine derivatives merit consideration as potential HER2 inhibitors; compounds 1o, 1f, 1n, 1d, 1m and 1w, which have good Glide scores, may show significant anti-breast-cancer activity, and further in vitro and in vivo investigations may prove their therapeutic potential.
    Keywords: Acridine; Isoxazole; docking studies; Insilico ADMET screening; MM-GBSA.

  • Identifying drug-like Inhibitors of Mycobacterium tuberculosis H37Rv Seryl tRNA Synthetase based on bioassay dataset: Homology modelling, docking and molecular dynamics simulation
    by ADARSH V. K., Santhiagu Arockiasamy 
    Abstract: Resistance of tuberculosis bacteria to existing drugs creates an immediate need to develop effective new chemical entities acting on emerging targets. Seryl-tRNA synthetase (SerRS) is essential for the viability of Mycobacterium tuberculosis (MTB) due to its crucial role in protein biosynthesis. In this study, we have attempted to develop the tertiary structure of SerRS through homology modelling and to elucidate the active site interactions of inhibitor compounds aided by docking. Homology modelling using PDB ID: 2DQ3: A chain as template and validation of the model were carried out with Modeller V9.13 and the SAVES online server respectively. About 1248 compounds from a putative kinase compound library of the PubChem database, found active in a whole cell bioassay (AID2842) on MTB - H37Rv, were used in docking studies using AutoDock. Of the tested molecules, nine that showed docking scores ≤ 10 kcal/mol with good drug-like properties were further subjected to molecular dynamics (MD) simulations, and three of the nine were found to have stable interactions with the enzyme. We believe these molecules, together with the knowledge about their docked poses, interaction patterns, and scaffolds, may provide hints for further target-specific screening and design.
    Keywords: drug design; homology modelling; Modeller; AutoDock; multidrug-resistant Mycobacterium tuberculosis; MDR-TB; Seryl-tRNA synthetase; SerRS; PubChem; molecular docking; molecular dynamics.

  • Boosting Gene Expression Clustering with System-Wide Biological Information: A Robust Autoencoder Approach
    by Hongzhu Cui, Chong Zhou, Xinyu Dai, Yuting Liang, Randy Paffenroth, Dmitry Korkin 
    Abstract: Gene expression analysis provides genome-wide insights into the transcriptional activity of a cell. One of the first computational steps in exploration and analysis of gene expression data is clustering. Although a number of standard clustering methods are routinely used, most of them do not take prior biological information into account. Here, we propose a new approach for gene expression clustering analysis. The approach benefits from a new deep learning architecture, Robust Autoencoder, which provides a more accurate high-level representation of the feature sets, and from incorporating prior system-wide biological information into the clustering process. We tested our approach on two gene expression datasets and compared the performance with two widely used clustering methods, hierarchical clustering and k-means, and with a recent deep learning clustering approach. Our approach outperformed all other clustering methods on the labeled yeast gene expression dataset. Furthermore, we showed that it is better at identifying functionally common clusters than k-means on the unlabeled human gene expression dataset. The results demonstrate that our new deep learning architecture can generalize well the specific properties of gene expression profiles. Furthermore, the results confirm our hypothesis that prior biological network knowledge is helpful in gene expression clustering.
    Keywords: gene expression; protein-protein interactions; clustering; deep learning.

  • Mathematical modelling of hepatitis C virus dynamics response to therapeutic effects of interferon and ribavirin
    by Jean Marie Ntaganda 
    Abstract: This paper aims at designing a two-compartment mathematical model for determining the response to protein (interferon) and drug (ribavirin) of a patient suffering from hepatitis C virus (HCV). The stability of the developed mathematical model is established. Using inverse techniques, model parameters and functions are identified. To test efficiency and response to interferon and ribavirin as HCV treatment, the mathematical model is validated by considering a patient on treatment for 12 months. The results obtained are rather satisfactory, since the model parameters vary around their corresponding equilibrium values for healthy subjects. Furthermore, the reaction of the disease to treatment can be modeled and a feedback can be approximated by the solution of an optimal control problem. The increasing necessity to interpret the meaning of measurable variables such as interferon and ribavirin under both physiological and pathological conditions for a patient has imposed the need for relatively simple models that should be able to describe as accurately as possible the mechanical behavior of the disease.
    Keywords: HCV; Treatment; Interferon; Ribavirin; Parameters identification; Stability; Equilibrium value; Healthy subjects; Numerical simulation.

Special Issue on: ICIBM 2018 Intelligent Biology and Medicine

  • Drug-Drug Interaction Prediction based on Co-Medication Patterns and Graph Matching
    by Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning 
    Abstract: High-order Drug-Drug Interactions (DDIs) and associated Adverse Drug Reactions (ADRs) are common, particularly for elderly people, and therefore represent a significant public health problem. Currently, high-order DDI detection primarily relies on the spontaneous reporting of ADR events. However, proactive prediction of unknown DDIs and their ADRs has indispensable benefit for protective health care. In this manuscript, the problem of predicting whether a drug combination of arbitrary order is likely to induce adverse drug reactions is considered. The prediction problem becomes highly non-trivial when drug combinations of arbitrary orders have to be accommodated by the prospective computational methods. To solve this problem, novel kernels over drug combinations of arbitrary orders are developed within support vector machines for the prediction. Graph matching methods are used in the novel kernels to measure the similarities among drug combinations, in which drug co-medication patterns are leveraged to measure single drug similarities. The experimental results on a real-world dataset demonstrated that the new kernels achieve an area under the curve (AUC) value of 0.912 for the prediction problem. The new methods with drug co-medication based single drug similarities can accurately predict whether a drug combination is likely to induce adverse drug reactions of interest.
    Keywords: drug-drug interaction prediction; drug combination similarity; co-medication; graph matching; arbitrary order; adverse drug reaction; myopathy; single drug similarity; support vector machines; binary classification problem.

  • Pessimistic Optimization For Modeling Microbial Communities With Uncertainty
    by Meltem Apaydin, Liang Xu, Bo Zeng, Xiaoning Qian 
    Abstract: It is important to understand the complicated interactions of microbial communities, which play critical roles in the ecological system, human health and diseases. Optimization-based mathematical models provide ways to analyze and obtain predictions on microbial communities. However, there are inherent model and data uncertainties in the existing knowledge and experiments about different microbial communities, so that the imposed models may not exactly reflect the reality in nature. Here, addressing these challenges and aiming to have a flexible framework to model microbial communities with uncertainty, we introduce P-OptCom, an extension of an existing method, OptCom, based on ideas from the pessimistic bilevel optimization literature. This framework relies on coordinated decision making between the single upper-level (community-level) and multiple lower-level (multiple microorganisms or guilds) decision makers to support robust solutions that better approximate microbial community steady states even when the individual microorganisms' behavior deviates from the optimum in terms of their cellular fitness criteria. We formulate the problem by considering suboptimal behavior of the individual members, and relaxing the constraints denoting the interactions within communities to obtain a model flexible enough to deal with potential uncertainties. Our study demonstrates that without experimental knowledge in advance, we are able to analyze the tradeoffs among the members of microbial communities and closely approximate the actual experimental measurements.
    Keywords: Microbial communities; Pessimistic bilevel optimization; Stoichiometric-based genome-scale metabolic modeling.

  • TopQA: A Topological Representation for Single-Model Protein Quality Assessment with Machine Learning
    by John Smith, Matthew Conover, Natalie Stephenson, Jesse Eickholt, Dong Si, Miao Sun, Renzhi Cao 
    Abstract: Correctly predicting the complex three-dimensional structure of a protein from its sequence would allow for a superior understanding of the function of specific proteins. Thus, advancements could be made in drug discovery, nanotechnology, and many other biological fields. We propose a novel method aimed at tackling a crucial step in the protein prediction problem: assessing the quality of generated predictions. Previously, some research has focused on qualities of proteins, such as the distance between amino acids or energy functions. Our method, to the best of our knowledge, is the first to analyze the topology of the predicted structure. We confirmed our representation with a widely used visualization tool, Chimera, and found that it provided accurate information regarding the location of the protein's backbone. Using this information, we implemented a novel algorithm based on a Convolutional Neural Network (CNN) to predict the GDT_TS score (a metric for assessing the quality of a model) for given protein models. Our method has shown promising results, achieving an overall correlation of 0.41 on the CASP12 testing dataset. Future work will aim to implement additional features into our representation. The software is freely available at GitHub:
    Keywords: Convolutional Neural Network; protein single-model quality assessment; topological representation.

  • A Hidden Markov Model-based approach to reconstructing double minute chromosome amplicons   Order a copy of this article
    by Ruslan Mardugalliamov, Kamal Al Nasr, Matthew Hayes 
    Abstract: Double minute chromosomes (DMs) are circular fragments of extrachromosomal DNA. They are a mechanism for extreme gene amplification in the cells of some malignant tumors. Their existence strongly correlates with malignant tumor cell behavior and drug resistance. Locating DMs is important for informing precision therapy in cancer treatment. Furthermore, accurate detection of double minutes requires precise reconstruction of their amplicons, which are the highly amplified, gene-carrying contiguous segments that adjoin to form DMs. This work presents AmpliconFinder, a Hidden Markov Model-based approach to detect DM amplicons. To assess its efficacy, AmpliconFinder was used to augment an earlier framework for DM detection (DMFinder), improving its robustness to noisy sequence data and thus its sensitivity to detect DMs. Experiments on simulated genomic data have shown that augmenting DMFinder with AmpliconFinder significantly increased the sensitivity of DMFinder on these data. Moreover, DMFinder with AmpliconFinder found all previously reported DMs in three pediatric medulloblastoma datasets, whereas the original DMFinder framework found none.
    Keywords: next generation sequencing; double minute chromosome; double minute; structural variation; amplicon; tumor genome reconstruction.

  • High scoring segment selection for pairwise whole genome sequence alignment with the maximum scoring subsequence and GPUs   Order a copy of this article
    by Abdulrhman Aljouie, Ling Zhong, Usman Roshan 
    Abstract: Whole genome alignment programs use exact string matching with hash tables to quickly identify high scoring fragments between a query and target sequence, around which a full alignment is then built. A recent large-scale comparison of alignment programs, called Alignathon, found that while evolutionarily similar genomes were easy to align, divergent genomes still posed a challenge to existing methods. As a first step to fill this gap, we explore the use of more exact methods to identify high scoring fragments, which we then pass on to a standard pipeline. We identify such segments between two whole genome sequences with the maximum scoring subsequence instead of hash tables. This is computationally extremely expensive, so we employ the parallelism of a Graphics Processing Unit to speed it up. We split the query genome into several fragments and determine each fragment's best match to the target with a previously published GPU algorithm for aligning short reads to a genome sequence. We then pass such high scoring fragments on to the LASTZ program, which extends each fragment to obtain a more complete alignment. Upon evaluation on simulated data, where the true alignment is known, we see that this method gives an average of at least 20% higher accuracy than the alignment given by LASTZ, at the expense of a few hours of additional runtime. We make our source code freely available.
    Keywords: genome alignment; anchor selection; LASTZ; GPU.

  • Brain-wide structural connectivity alterations under the control of Alzheimer risk genes   Order a copy of this article
    by Jingwen Yan, Vinesh Raja V, Zhi Huang, Enrico Amico, Kwangsik Nho, Shiaofen Fang, Olaf Sporns, Yu-chien Wu, Andrew Saykin, Joaquin Goni, Li Shen 
    Abstract: Background: Alzheimer's disease is the most common form of brain dementia, characterized by gradual loss of memory followed by further deterioration of other cognitive function. Large-scale genome-wide association studies have identified and validated more than 20 AD risk genes. However, how these genes are related to the brain-wide breakdown of structural connectivity in AD patients remains unknown. Methods: We used the genotype and diffusion tensor imaging (DTI) data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. After constructing the brain network for each subject, we extracted three types of link measures, including fiber anisotropy, fiber length and density. We then performed a targeted genetic association analysis of brain-wide connectivity measures using general linear regression models. Age at scan and gender were included in the regression model as covariates. For fair comparison of the genetic effect on different measures, fiber anisotropy, fiber length and density were all normalized to a mean of 0 and a standard deviation of 1. We aim to discover the abnormal brain-wide network alterations under the control of 34 AD risk SNPs identified in previous large-scale genome-wide association studies. Results: After enforcing the stringent Bonferroni correction, rs10498633 in SLC24A4 was found to be significantly associated with anisotropy, total number and length of fibers, including some connecting brain hemispheres. rs429358 in the top AD risk gene APOE shows nominal significance of association with the density of fibers between Subcortical and Cerebellum (p=2.71e-6).
    Keywords: brain connectivity; imaging genetics association; Alzheimer's disease.

  • A De-Novo drug design and ADMET study to design small molecule stabilizers targeting mutant (V210I) human prion protein against familial Creutzfeldt-Jakob disease (fCJD).   Order a copy of this article
    by Rafat Alam, G.M. Sayedur Rahman, Nahid Hasan, Abu Sayeed Chowdhury 
    Abstract: The purpose of our project was to computationally design small molecule stabilizers targeting mutant (V210I) human prion protein (HuPrP), using a combined de-novo design, pharmacophore, molecular docking and ADMET study, to cure familial Creutzfeldt-Jakob disease (fCJD). Successful development of drugs against familial CJD might provide valuable insight for the design and development of new antiprion drugs and help to understand their mechanisms. We collected the target protein structure from the Protein Data Bank (RCSB PDB). After that, we minimized the energy using the Yasara energy minimization webserver and validated the structure using the RAMPAGE webserver. We used KV Finder, a plug-in of Pymol, to identify the drug binding pockets in the target protein. The pocket information was used for de-novo ligand design using the e-LEA3D webserver. Those ligands were used to generate a pharmacophore for the selected pocket using LigandScout. The designed pharmacophore was applied to the webserver Pharmit for virtual screening of small molecules from the Pubchem database, and the screened small molecules were docked into the target pocket of the protein using the software Autodock Vina. The best five molecules were identified, with binding affinities of 7.7, 7.2, 7.2, 7.1 and 7.1 kcal/mol respectively. Finally, we analyzed the ADMET properties of the best five ligands using the webserver SwissADME. All five small molecules were proven to be ideal candidates for further drug development.
    Keywords: ADMET; de-novo drug design; Docking; Prion; PDB; Pharmacophore.

Special Issue on: BIBM 2017 Integrative Data Analysis in System Biology

  • Distance Based Knowledge Retrieval through Rule Mining for Complex Biomarker Recognition from Tri-Omics Profiles   Order a copy of this article
    by Saurav Mallik, Zhongming Zhao 
    Abstract: Over the past two decades, biomarker discovery from complex biomedical data has become an important topic for unveiling significant new knowledge and disease signals for disease prevention, diagnosis and treatment. In general, most of the earlier methods for complex marker discovery have been proposed on the basis of a single genomic profile, and most of them utilize a single minimum support, single minimum confidence, or single minimum lift cutoff. To overcome these general shortcomings, in this manuscript, we developed a framework for identifying complex markers using the shortest distance based rule mining technique from the tri-omics profiles (namely, gene expression, DNA methylation and protein-protein interaction). We applied our method to a multi-omics dataset for high-grade soft tissue sarcomas. The novel markers of the sarcoma that we identified were {GRB2-, STAT3-} (i.e., both GRB2 and STAT3 as down-regulated and hyper-methylated; - denotes decreased gene activity, while + denotes increased activity), {STAT3+, TP53-, MAPK3+} (i.e., both STAT3 and MAPK3 as up-regulated and hypo-methylated, and TP53 as down-regulated and hyper-methylated) and {STAT3+, FYN+, MAPK3+} (i.e., all of STAT3, FYN and MAPK3 as up-regulated and hypo-methylated). In a comparison of our rule mining method with the existing rule mining approaches, we showed the superiority and efficiency of our method, as it generates fewer rules and a lower mean of the shortest distance than the existing methods. In addition, we evaluated the markers by conducting KEGG pathway analyses as well as an extensive literature search. In conclusion, our method is useful for extracting complex markers from tri-omics profiles of the data for complex diseases or cellular conditions.
    Keywords: Tri-omics data; Multiple Minimum Supports/Confidences/Lifts; Empirical Bayes Test; Weighted Shortest Distance; Complex marker.

  • Identification of temporal network changes in short-course gene expression from C. elegans reveals structural volatility   Order a copy of this article
    by Kathryn Cooper, Wail Hassan, Hesham Ali 
    Abstract: Many Bioinformatics algorithms attempt to extract relevant biological information from datasets obtained at specific data points. However, it is critical to identify changing genes in temporal data so that studies can focus on the dynamics of gene expression. While networks continue to play a significant role in characterizing biological relationships, most biomedical network modeling studies focus on static network-based analysis. In this study, we use a temporal, network-based approach to identify and rank genes that exhibit variation in short-course gene expression. We use a C. elegans gene correlation network obtained from mRNA expression to illustrate the value of the proposed approach, and compare the results of this method to results obtained from traditional differential gene expression analysis. We show that temporal network analysis identifies genes that are inherently different from differentially expressed genes, raising new questions about structural meaning in expression networks and how changes in expression are observed.
    Keywords: temporal network structural change; short-course gene expression; structural volatility; biological network modeling; differential gene expression.

  • Managing data provenance for bioinformatics workflows using AProvBio   Order a copy of this article
    by Rodrigo Almeida, Waldeyr Silva, Klayton Castro, Maria Emília Machado Telles Walter, Aletéia Patricia Favacho De Araújo, Sergio Lifschitz, Maristela Holanda 
    Abstract: Scientific experiments in bioinformatics are often executed as computational workflows. Data provenance involves documenting the history, and the paths of the input data, from the beginning to the end of an experiment. AProvBio is an architecture that enables the capture and storage of data provenance for bioinformatics workflows using the PROV-DM standard model. AProvBio works with three types of data provenance: prospect, retrospect, and the user-defined type. Given how graphs conveniently express PROV-DM, we have designed and implemented a simulator for storing the data provenance in a graph database system. This paper presents details and implementation aspects of our architecture, and an evaluation of AProvBio through the carrying out of two real case scenarios.
    Keywords: bioinformatics; scientific workflows; data provenance; PROV-DM; graph database.

  • Simulating genetically heterozygous genomes in the tumor tissue according to its clonal evolution history   Order a copy of this article
    by Yanshuo Chu, Mingxiang Teng, Yadong Wang 
    Abstract: Tumors contain multiple, genetically diverse subclonal populations of cells that have evolved from a single progenitor population through successive waves of expansion and selection. Next-generation sequencing (NGS) and third-generation sequencing (TGS) have recently allowed us to develop algorithms to quantitatively dissect the extent of heterogeneity within a tumor, resolve cancer evolution history and identify the somatic variations and aneuploidy events with subclonal frequency. However, existing tumor NGS data lack ground-truth annotation sufficient to validate all these NGS-based tumor analysis algorithms. To benchmark these algorithms, a powerful tumor genome simulation tool is needed that can simulate all the distinct subclonal genomes, with diverse aneuploidy events and somatic variations, according to a given tumor evolution history. We provide a simulation package, Pysubsim-tree, which simulates tumor genomes according to their evolution history as defined by the somatic variations and aneuploidy events. Pysubsim-tree is free, open source, available at:
    Keywords: Somatic variations; Cancer evolution history; Tumor heterogeneity; Tumor genome simulation.

  • Networks Regulated by Ginger towards Stomach and Small Intestine for Its Warming Interior Function   Order a copy of this article
    by Guang Zheng 
    Abstract: Ginger is widely used as both a cooking spice in east/south Asia and a traditional Chinese medicine (TCM) for its warming interior function, which is a TCM concept mainly referring to warming up the stomach and small intestine. This edible therapeutic function has been identified in the long history of TCM clinical and regimen practices. However, the underlying mechanism of warming interior is still obscure at the protein regulating network level. In this study, 6-gingerol and 6-shogaol, ginger's two bioactive compounds, are selected to initialize the underlying protein regulating networks for the stomach and small intestine. The first step is to identify the proteins targeted/regulated by ginger. These proteins were extracted from PubMed literature and compound-protein databases. Starting with these targeted proteins, functional protein-protein interactions (FPPI) were selected to form the underlying regulating networks towards proteins expressed in the stomach and small intestine. Further enrichment analysis of FPPI participating proteins (e.g. PPARG, GSK3B, GPD1, SCD, GSK3A, and IRS2) highlights five key metabolic processes on ATP, glycogen, coenzyme, glycerolipid and fatty acid. As a result, this FPPI network can be validated by PubMed literature, together with the online bioinformatics tools DAVID and KEGG. In brief, ginger's warming interior function is elaborated via a specified FPPI network. When activated by 6-gingerol and 6-shogaol, these five key metabolic processes can release more energy/heat to carry out the warming interior function that TCM attributes to ginger.
    Keywords: ginger; biological process; regulation network; warming interior; metabolic process.

  • A comparative study of multiclass feature selection on RNAseq and microarray data   Order a copy of this article
    by Silu Zhang, Junqing Wang, Keli Xu, Megan York, Yin-yuan Mo, Yixin Chen, Yunyun Zhou 
    Abstract: Gene expression profiles are widely used for identifying phenotype-specific biomarkers in clinical cancer research. By examining important gene features that are specifically up- or down-expressed in different phenotypes, clinicians can classify patients into different treatment groups for precision medicine. Microarray and RNAseq are the two leading technologies for measuring gene expression data. However, due to the heterogeneity of the two experimental platforms, the gene signatures selected from them are different, and few studies have comparatively investigated the classification performance of gene features selected from the two platforms. In this project, using human breast cancer expression data as the example, we systematically compared the cancer subtype classification accuracies of the gene signatures selected by four popular multiclass feature selection algorithms and discussed the strengths and weaknesses of the selected genes across different experimental platforms and cohorts. Our results showed that classification with the selected genes performs best within the same platform, even across different cohorts. Our results suggested that merging datasets that belong to the same platform will increase the statistical power and improve the prediction accuracy of the selected genes for multiclass classification analysis.
    Keywords: Systems biology; feature selection; breast cancer; cancer subtypes; machine learning; functional analysis; integration analysis; pathway.