International Journal of Data Mining and Bioinformatics (6 papers in press)
Deep Learning Approaches in Electron Microscopy Imaging for Mitochondria Segmentation
by Ismail Oztel, Gozde Yolcu, İlker Ersoy, Tommi White, Filiz Bunyak
Abstract: Deep neural networks provide outstanding classification and detection accuracy in biomedical imaging applications. We present a study of mitochondria segmentation in electron microscopy (EM) images. Mitochondria play a significant role in the cell cycle by generating the needed energy, and show quantifiable morphological differences in diseases such as cancer, metabolic disorders, and neurodegeneration. EM imaging allows researchers to observe, at high resolution, the morphological changes in cells that are part of the disease process. Manual segmentation of mitochondria in large sequences of EM images is time-consuming and prone to subjective delineation. Thus, manual segmentation may not provide the accuracy needed to quantify morphological changes. We show that a convolutional neural network provides accurate mitochondria segmentation in the CA1 hippocampus area of the brain, imaged by a focused ion beam scanning electron microscope (FIBSEM). We compare our results quantitatively with other studies on the same data set and with other deep neural network approaches.
Keywords: Deep Learning; Convolutional Neural Networks; Image Segmentation; Electron Microscopy; Mitochondria.
Identification of drug efficacy change using reconstructed network altered by SNPs on pathway gene
by Sukyung Seo, Taekeon Lee, Giup Jang, Soyoun Hwang, Youngmi Yoon
Abstract: Precision medicine is a medical approach tailored to individual patients. Recently, precision medicine has become more accessible, as the cost of analyzing a human genome is now below 1,000 USD. Since genetic variations such as SNPs affect proteins such as receptors, transporters, and enzymes, they cause variability in drug response between people. This can be a serious risk to both patients and clinicians. However, no network-based studies of drug efficacy have taken these genetic variations into account. This study presents a method to identify the change in drug efficacy for each pathway gene that affects the mode of action of drugs. Using public genetic information, we construct two types of drug action sub-network, one with SNPs and one without SNPs, and calculate the differences in drug efficacy. We expect that our model can be utilized for precision medicine by predicting the change in drug efficacy.
Keywords: drug efficacy; SNP; network reconstruction; genetic variants; precision medicine; transcription factor.
A Boolean network model of bacterial quorum-sensing systems
by Gonzalo A. Ruz, Ana Zuñiga, Eric Goles
Abstract: There are several mathematical models for representing gene regulatory networks; one of the simplest is the Boolean network paradigm. In this paper, we reconstruct a regulatory network of bacterial quorum-sensing systems. In particular, we consider Paraburkholderia phytofirmans PsJN, a plant-growth-promoting bacterium that produces positive effects in horticultural crops such as tomato, potato and grape. To learn the regulatory network from the temporal expression pattern of quorum-sensing genes at plant roots, we present a methodology that trains a perceptron for each gene and then integrates the perceptrons into one Boolean regulatory network. Using the proposed approach, we were able to infer a regulatory network model whose topology and dynamics helped us gain insight into the regulation mechanism of the quorum-sensing systems. We compared our results with the REVEAL and Best-Fit extension algorithms, showing that the proposed neural network approach obtained a more biologically meaningful network and dynamics, which demonstrates the effectiveness of the proposed method.
Keywords: Gene Regulatory Networks; Quorum-Sensing Systems; Boolean Networks; Neural Networks; Network Inference.
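The idea of training one perceptron per gene and combining them into a Boolean network can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-gene toy time series, the function names (`train_perceptron`, `infer_network`, `step`), and the plain perceptron learning rule are not taken from the paper, whose actual training procedure and data are not described in the abstract.

```python
def train_perceptron(inputs, targets, epochs=100, lr=1.0):
    """Classic perceptron rule on binary vectors; returns (weights, bias)."""
    n = len(inputs[0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(inputs, targets):
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if y != t:
                errors += 1
                for i in range(n):
                    w[i] += lr * (t - y) * x[i]
                b += lr * (t - y)
        if errors == 0:  # every observed transition is reproduced
            break
    return w, b


def infer_network(series):
    """Fit one perceptron per gene: full state at time t -> gene value at t+1."""
    inputs = series[:-1]
    n_genes = len(series[0])
    return [
        train_perceptron(inputs, [state[g] for state in series[1:]])
        for g in range(n_genes)
    ]


def step(network, state):
    """Synchronous Boolean update: each gene applies its threshold function."""
    return tuple(
        1 if sum(wi * xi for wi, xi in zip(w, state)) + b > 0 else 0
        for w, b in network
    )


# Hypothetical binarized expression time series for three genes (toy data).
series = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1), (0, 0, 0)]
network = infer_network(series)

# Replay the dynamics from the first observed state.
state = series[0]
trajectory = [state]
for _ in range(len(series) - 1):
    state = step(network, state)
    trajectory.append(state)
print(trajectory)
```

On this toy series the learned threshold functions reproduce the observed transitions, so the simulated trajectory follows the same five-state cycle as the input. Real expression data would first need to be binarized, and a perceptron can only represent linearly separable update rules.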
CS-ABC: a Cooperative System based on Artificial Bee Colony to Resolve the DNA Fragment Assembly Problem
by Elamine ZEMALI, Abdelmadjid Boukra
Abstract: The DNA Fragment Assembly Problem (DFA) is one of the most active research areas in bioinformatics. It consists in assembling a set of DNA fragments to determine the complete genome sequence. Because of the large number of fragments to assemble, this problem is classified as an NP-hard optimization problem. Thus, to deal with the large search space of such a problem, we propose a new cooperative approach involving a set of metaheuristics. The proposed cooperative approach, named CS-ABC, is based on the artificial bee colony algorithm. In this approach, metaheuristics cooperate as bees with the artificial bee colony algorithm to improve the exploration and exploitation ability, forming a cooperative system. The use of a set of metaheuristics naturally improves the exploration ability, since each of them explores the search space differently. Communication between these metaheuristics is established through a shared memory. Exploitation is also enhanced by using different efficient DFA methods that communicate according to the master-slave model. In the computational experiments, we first analyze the behavior of the proposed method in resolving the DFA problem. Then, we compare its performance against numerous DFA methods on noiseless and noisy data based on three models of error. The proposed method obtained promising results.
Keywords: DNA Fragment Assembly Problem; cooperation; metaheuristics; bioinformatics; ant colony system; biogeography based optimization; Artificial Bee Colony algorithm.
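To make the search space concrete, the following minimal sketch scores a candidate fragment ordering by the total overlap between adjacent fragments, an objective often used when fragment assembly is treated as a permutation optimization problem. It is an assumption that a score of this kind is relevant here: the abstract does not state the objective used by CS-ABC. The fragments, the function names (`overlap`, `fitness`), and the brute-force search are hypothetical and only feasible for toy inputs.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def fitness(order, fragments):
    """Sum of overlaps between consecutive fragments in the given order."""
    return sum(
        overlap(fragments[i], fragments[j]) for i, j in zip(order, order[1:])
    )


# Hypothetical error-free fragments (toy data).
fragments = ["ATTAGAC", "GACCTA", "CTATGG"]

# Exhaustive search over all orderings; the number of orderings grows
# factorially, which is why metaheuristics are used on realistic inputs.
best_order = max(
    permutations(range(len(fragments))),
    key=lambda order: fitness(order, fragments),
)
print(best_order, fitness(best_order, fragments))
```

A metaheuristic would replace the exhaustive `max` over permutations with a guided search, while a scoring function of this kind could stay unchanged. Noisy reads would require approximate rather than exact overlap matching.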
CNV-LDC: An Optimized Method for Copy Number Variation Discovery in Low Depth of Coverage Data
by Ayyoub Salmi, Sara El Jadid, Ismail Jamail, Taoufik Bensellak, Romain Philippe, Veronique Blanquet, Ahmed Moussa
Abstract: Recent advances in sequencing technologies have led to an increasing number of highly accurate ways of identifying and studying copy number variations (CNVs). Many methods and software packages have been developed for the detection of CNVs; these methods are generally based on four approaches: Assembly Based, Split Read, Read-Paired mapping and Read Depth. In this paper, we introduce an alternative method for detecting CNVs from short sequencing reads, CNV-LDC (Copy Number Variation-Low Depth of Coverage), that complements the existing method named CNV-TV (Copy Number Variation-Total Variation). To evaluate the performance of our method, we compared it with some commonly used, freely available methods that take different approaches to identifying CNVs: Pindel, CNVnator and DELLY2. For this comparative study, we first used simulated data to gain control over deletions and duplications, and then used real data from the 1000 Genomes Project to further test the quality of the detected CNVs.
Keywords: Copy Number Variation; NGS Data; Read Depth; Low Depth of Coverage.
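The Read Depth approach named in the abstract can be illustrated with a generic sketch: count reads in fixed-size windows, compare each window with the median depth, and flag windows whose log2 ratio is far from zero. This is not the CNV-LDC algorithm, which the abstract does not describe. The window size, the pseudocount, the thresholds `gain` and `loss`, and the function names are all illustrative assumptions.

```python
from math import log2
from statistics import median


def window_counts(read_starts, genome_length, window):
    """Count reads whose start position falls into each fixed-size window."""
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        counts[pos // window] += 1
    return counts


def call_windows(counts, gain=0.4, loss=-0.6):
    """Label each window by its log2 depth ratio against the median depth.

    The thresholds are arbitrary illustrative values, not tuned ones.
    """
    baseline = median(counts)
    calls = []
    for c in counts:
        # A pseudocount of 0.5 avoids log2(0) for empty windows.
        ratio = log2((c + 0.5) / (baseline + 0.5))
        if ratio >= gain:
            calls.append("dup")
        elif ratio <= loss:
            calls.append("del")
        else:
            calls.append("normal")
    return calls


# Hypothetical per-window read counts (toy data).
counts = [20, 21, 19, 20, 40, 41, 20, 5, 20, 19]
print(call_windows(counts))
```

At low depth of coverage the per-window counts are small and noisy, so fixed thresholds like these become unreliable. That difficulty is the setting the paper targets.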
Deriving Enhanced Geographical Representations via Similarity-based Spectral Analysis: Predicting Colorectal Cancer Survival Curves in Iowa
by Michael Lash, Min Zhang, Xun Zhou, Nick Street, Charles Lynch
Abstract: Neural networks are capable of learning rich, nonlinear feature representations shown to be beneficial in many predictive tasks. In this work, we use such models to explore different geographical feature representations in the context of predicting colorectal cancer survival curves for patients in the state of Iowa, spanning the years 1989 to 2013. Specifically, we compare model performance using "area between the curves" (ABC) to assess (a) whether survival curves can be reasonably predicted for colorectal cancer patients in the state of Iowa, (b) whether geographical features improve predictive performance, (c) whether a simple binary representation or a richer, spectral analysis-elicited representation performs better, and (d) whether spectral analysis-based representations can be improved upon by leveraging geographically-descriptive features. In exploring (d), we devise a similarity-based spectral analysis procedure, which allows for the combination of geographically relational and geographically descriptive features. Our findings suggest that survival curves can be reasonably estimated on average, with predictive performance deviating at the five-year survival mark among all models. We also find that geographical features improve predictive performance, and that better performance is obtained using richer, spectral analysis-elicited features. Furthermore, we find that similarity-based spectral analysis-elicited representations improve upon the original spectral analysis results by approximately 40%.
Keywords: Geographical representations; Spectral analysis; Deep learning; Spectral clustering; Neural networks; Colorectal cancer; Survival curve.
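One plausible reading of the "area between the curves" (ABC) measure is the integral of the absolute difference between a predicted and an observed survival curve. The sketch below approximates that integral with the trapezoidal rule on a shared time grid. The exact definition used in the paper is not given in the abstract, so the formula, the function name `area_between_curves`, and the toy curves are assumptions.

```python
def area_between_curves(times, s_pred, s_true):
    """Trapezoidal estimate of the integral of |s_pred(t) - s_true(t)|.

    Assumes both curves are sampled at the same increasing time points.
    If the curves cross inside an interval, the estimate is approximate.
    """
    total = 0.0
    for i in range(1, len(times)):
        d_left = abs(s_pred[i - 1] - s_true[i - 1])
        d_right = abs(s_pred[i] - s_true[i])
        total += 0.5 * (d_left + d_right) * (times[i] - times[i - 1])
    return total


# Hypothetical yearly survival probabilities over five years (toy data).
times = [0, 1, 2, 3, 4, 5]
s_true = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
s_pred = [1.0, 0.85, 0.8, 0.75, 0.6, 0.55]

abc = area_between_curves(times, s_pred, s_true)
print(round(abc, 3))
```

A smaller value means the predicted curve tracks the observed one more closely. Identical curves give an ABC of zero.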