Title: SeqTrim07: a pipeline for preprocessing sequence reads

Authors: Juan Falgueras, Antonio J. Lara, Guillermo Perez-Trabado, Noe Fernandez-Pozo, Francisco R. Canton, M. Gonzalo Claros

Addresses: Dep. Lenguajes y Ciencias de la Computacion, Universidad de Malaga, 29071 Malaga, Spain; Centro de Supercomputacion y Bioinformatica, Universidad de Malaga, 29071 Malaga, Spain; Dep. Arquitectura de Computadores, Universidad de Malaga, 29071 Malaga, Spain; Dep. Biologia Molecular y Bioq., Universidad de Malaga, 29071 Malaga, Spain; Dep. Biologia Molecular y Bioq., Universidad de Malaga, 29071 Malaga, Spain; Dep. Biologia Molecular y Bioq., Universidad de Malaga, 29071 Malaga, Spain

Abstract: SeqTrim is a pipeline designed to preprocess sequence reads. It is easy to install and configure, remains flexible although its default parameters are accurate for most purposes, and can be used through a web interface or as a standalone command-line application. It identifies the sequence insert by removing low-quality sequences, cloning vector, poly-A or poly-T tails, adaptors and any contaminant sequence or unwanted feature. Several input and output formats are available, which enables its inclusion in existing or newly defined sequence-processing workflows. It outperforms preprocessors implemented in other web servers and standalone applications, at least in detecting adaptors and chimeric clones. Thanks to collaboration between biologists and computer scientists, SeqTrim is under continuous refinement to deal with most sequence events.
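
The insert-identification idea described in the abstract can be illustrated with a minimal Python sketch. This is not SeqTrim's implementation: the function names, the quality threshold (`min_q`), the minimum tail length (`min_len`), the exact-prefix adaptor match and the example adaptor sequence are all assumptions chosen for illustration, and cloning-vector and contaminant detection are left out.

```python
import re

def trim_low_quality(seq, quals, min_q=20):
    """Drop bases from both ends until a base with quality >= min_q is reached."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]

def trim_poly_tails(seq, min_len=10):
    """Remove a trailing poly-A run and a leading poly-T run of >= min_len bases."""
    seq = re.sub(r"A{%d,}$" % min_len, "", seq)
    seq = re.sub(r"^T{%d,}" % min_len, "", seq)
    return seq

def extract_insert(seq, quals, adaptor="", min_q=20, min_len=10):
    """Hypothetical three-step cleanup: quality ends, 5' adaptor, poly-A/T tails."""
    seq, quals = trim_low_quality(seq.upper(), quals, min_q)
    # Assumption: the adaptor, if present, is an exact prefix of the read.
    if adaptor and seq.startswith(adaptor):
        seq = seq[len(adaptor):]
    return trim_poly_tails(seq, min_len)

# Made-up read: adaptor + insert + poly-A tail + two low-quality bases.
adaptor = "ACGTAC"  # hypothetical adaptor sequence
read = adaptor + "GGCCTTGGCG" + "A" * 12 + "NN"
quals = [30] * (len(read) - 2) + [5, 5]
insert = extract_insert(read, quals, adaptor=adaptor)
print(insert)
```

A real preprocessor would typically rely on alignment rather than exact string matching, so that adaptors and vector fragments containing sequencing errors are still detected.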

Keywords: preprocessing; sequence reads; chromatograms; assembly; poly-A+; poly-T+; quality; web interface; command line; workflow; bioinformatics; sequences; sequencing; adaptors; chimeric clones.

DOI: 10.1504/IJCIBSB.2010.038217

International Journal of Computational Intelligence in Bioinformatics and Systems Biology, 2010 Vol.1 No.4, pp.370 - 382

Published online: 23 Jan 2011
