Int. J. of Data Mining and Bioinformatics   »   2013 Vol.7, No.2

 

 

Title: Towards a database for genotype-phenotype association research: mining data from encyclopaedia

 

Authors: Vesna S. Pajić; Gordana M. Pavlović-Lažetić; Miloš V. Beljanski; Bernd W. Brandt; Miloš B. Pajić

 

Addresses:
University of Belgrade, Faculty of Agriculture, Nemanjina 6, Zemun, Belgrade 11080, Serbia
University of Belgrade, Faculty of Mathematics, P.O.B. 550, Studentski trg 16, Belgrade 11001, Serbia
University of Belgrade, Institute of General and Physical Chemistry, P.O.B. 551, Studentski trg 16, Belgrade 11001, Serbia
Department of Preventive Dentistry, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and VU University Amsterdam; Centre for Integrative Bioinformatics (IBIVU), VU University Amsterdam, The Netherlands
University of Belgrade, Faculty of Agriculture, Nemanjina 6, Zemun, Belgrade 11080, Serbia

 

Abstract: Associating the phenotypic characteristics of an organism with the molecules encoded by its genome requires well-structured genotype and phenotype data. We use a novel method for extracting data on the phenotypic and genotypic characteristics of microorganisms from text. As a source, we use an encyclopaedia of microorganisms, which holds phenotypic and genotypic data. From it we create a structured, flexible data resource containing genotype and phenotype data for 2412 species and 873 genera of microbes; the resource can be exported to a range of database formats. This data source has great potential as a resource for future biological research on genotype-phenotype associations. In this paper, we focus on describing the structure and content of the resulting database and on evaluating the method used for extracting the data. We conclude that the resulting database can be used as a reliable complementary resource for research into genotype-phenotype association.

 

Keywords: biological databases; bioinformatics; genotype-phenotype association; text mining; information extraction; knowledge acquisition; knowledge discovery; unstructured textual resources; data mining; encyclopaedias.

 

DOI: 10.1504/IJDMB.2013.053196

 

Int. J. of Data Mining and Bioinformatics, 2013 Vol.7, No.2, pp.196-213

 

Available online: 29 Mar 2013

 

 
