Title: A novel strategy for molecular signature discovery based on independent component analysis

Authors: Hang-Phuong Pham; Nicolas Dérian; Wahiba Chaara; Bertrand Bellier; David Klatzmann; Adrien Six

Addresses (in author order):
Hang-Phuong Pham: Immunology, Immunopathology, Immunotherapy, UPMC Univ Paris 06, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, CNRS, UMR 7211, F-75013 Paris, France
Nicolas Dérian: Immunology, Immunopathology, Immunotherapy, UPMC Univ Paris 06, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, CNRS, UMR 7211, F-75013 Paris, France
Wahiba Chaara: Immunology, Immunopathology, Immunotherapy, UPMC Univ Paris 06, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, CNRS, UMR 7211, F-75013 Paris, France
Bertrand Bellier: Immunology, Immunopathology, Immunotherapy, UPMC Univ Paris 06, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, CNRS, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, INSERM, U959, F-75013 Paris, France
David Klatzmann: Immunology, Immunopathology, Immunotherapy, UPMC Univ Paris 06, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, CNRS, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, INSERM, U959, F-75013 Paris, France
Adrien Six: Immunology, Immunopathology, Immunotherapy, UPMC Univ Paris 06, UMR 7211, F-75013 Paris, France; Immunology, Immunopathology, Immunotherapy, CNRS, UMR 7211, F-75013 Paris, France

Abstract: Microarray analysis often yields either too many or too few candidate genes to allow meaningful identification of functional signatures. We aimed to overcome this hurdle by combining two algorithms: (i) Independent Component Analysis (ICA), to extract statistically based potential signatures; and (ii) Gene Set Enrichment Analysis (GSEA), to produce an enrichment score, with its statistical significance, for each potential signature. We applied this strategy to identify regulatory T cell (Treg) molecular signatures from two experiments in mice, with cross-validation. These signatures can detect the ∼1% of Treg present in whole spleen. These findings demonstrate the relevance of our approach as a signature discovery tool.
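
The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's FastICA as the ICA step, replaces GSEA with a simplified unweighted running-sum score and a gene-set permutation test, and uses synthetic data with a planted 15-gene module. All function names, parameters and data below are hypothetical.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Synthetic "discovery" experiment (hypothetical): 200 genes x 12 samples,
# with a planted 15-gene module up-regulated in the first 6 samples.
n_genes = 200
module = np.arange(15)
expr = rng.normal(size=(n_genes, 12))
expr[module, :6] += 4.0

def ica_signatures(expr, n_components=3, top=15, seed=0):
    """Return one candidate gene-index set per independent component."""
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    # Genes are the observations, so each component is a vector of gene loadings.
    loadings = ica.fit_transform(expr - expr.mean(axis=1, keepdims=True))
    # Keep the genes with the largest absolute loading on each component.
    return [np.argsort(-np.abs(loadings[:, k]))[:top] for k in range(n_components)]

def enrichment_score(ranked_genes, gene_set):
    """Simplified, unweighted running-sum score (not the full GSEA statistic)."""
    in_set = np.isin(ranked_genes, gene_set)
    n_hit = in_set.sum()
    n_miss = len(ranked_genes) - n_hit
    running = np.cumsum(np.where(in_set, 1.0 / n_hit, -1.0 / n_miss))
    return running[np.argmax(np.abs(running))]

def permutation_pvalue(ranked_genes, gene_set, n_perm=500, seed=0):
    """Empirical p-value from random gene sets of the same size."""
    perm_rng = np.random.default_rng(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    null = np.array([
        enrichment_score(
            ranked_genes,
            perm_rng.choice(ranked_genes, size=len(gene_set), replace=False),
        )
        for _ in range(n_perm)
    ])
    p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p

# Independent "validation" experiment (hypothetical): rank genes by the
# difference in mean expression between two groups of 5 samples.
expr2 = rng.normal(size=(n_genes, 10))
expr2[module, :5] += 3.0
diff = expr2[:, :5].mean(axis=1) - expr2[:, 5:].mean(axis=1)
ranked = np.argsort(-diff)  # gene indices, most up-regulated first

# Score every ICA-derived candidate signature on the validation ranking
# and keep the one with the largest absolute enrichment score.
signatures = ica_signatures(expr)
scores = [permutation_pvalue(ranked, sig) for sig in signatures]
best_idx = int(np.argmax([abs(es) for es, _ in scores]))
best = signatures[best_idx]
es, p = scores[best_idx]
print(f"best component: {best_idx}, ES = {es:.2f}, p = {p:.3f}")
```

In this sketch the candidate signatures come from one dataset and are scored on a second, which loosely mirrors the cross-validation between two experiments mentioned in the abstract. The published GSEA method uses a weighted statistic and offers phenotype permutation, neither of which is reproduced here.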

Keywords: data mining; bioinformatics; statistical modelling; transcriptome; gene expression; microarray data analysis; GSEA; gene set enrichment analysis; ICA; independent component analysis; T lymphocyte; regulatory T cell; Treg; molecular signatures; signature discovery.

DOI: 10.1504/IJDMB.2014.060052

International Journal of Data Mining and Bioinformatics, 2014, Vol. 9, No. 3, pp. 277-304

Received: 01 Jan 2011
Accepted: 01 Feb 2012

Published online: 21 Oct 2014
