SiPaGene: A new repository for instant online retrieval, sharing and meta-analyses of GeneChip expression data

BMC Genomics. 2009 Mar 5:10:98. doi: 10.1186/1471-2164-10-98.

Abstract

Background: Microarray expression profiling is becoming a routine technology for medical research and generates enormous amounts of data. However, reanalysis of public data and comparison with one's own results is laborious. Although many different tools exist, there is a need for more convenient online analysis with access restrictions and user-specific sharing options. Furthermore, most existing tools do not use the full statistical power provided by the MAS5.0/GCOS algorithms.

Description: With a current focus on immunology, infection, inflammation, tissue regeneration and cancer, we developed a database platform that can load preprocessed Affymetrix GeneChip expression data for immediate access. Group or subgroup comparisons can be calculated online, queried for candidate genes or for transcriptional activity in various biological conditions, and compared across different experiments. The system is based on Oracle 9i, with algorithms written in Java and graphical user interfaces implemented as Java servlets. Signals, detection calls, signal log ratios, change calls and corresponding p-values were calculated with MAS5.0/GCOS algorithms. MIAME information and gene annotations are provided via links to GEO and EntrezGene. Users access their own, shared or public data via HTTPS. Sharing is comparison- and user-specific, with different levels of rights. Arrays for group comparisons can be selected individually. Twenty-two different group comparison parameters can be applied in user-defined combinations to single or multiple group comparisons. Identified genes can be reviewed online or downloaded. Optimized selection criteria were developed, and reliability was demonstrated with the "Latin Square" data set. Currently, more than 1,000 arrays, 10,000 pairwise comparisons and 500 group comparisons are available, with public or restricted access for different research networks or individual users.
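
To illustrate how a group comparison might be built from MAS5.0/GCOS pairwise comparisons, the following minimal Python sketch pairs every array of a baseline group with every array of an experiment group and summarizes the change calls and signal log ratios for one probe set. It is not SiPaGene's implementation (which the abstract describes as Java on Oracle 9i). The function names, the data layout, the example values and the two thresholds are hypothetical and stand in for the twenty-two parameters mentioned above, which the abstract does not enumerate.

```python
from itertools import product
from statistics import mean

def summarize_group_comparison(pairwise, baseline_arrays, experiment_arrays):
    """Summarize all baseline-vs-experiment pairwise comparisons for one probe set.

    `pairwise` maps (baseline_array, experiment_array) to a tuple of
    (change_call, signal_log_ratio). Change calls use the MAS5.0 codes
    "I", "MI", "NC", "MD", "D". This layout is an assumption for illustration.
    """
    pairs = list(product(baseline_arrays, experiment_arrays))
    calls = [pairwise[p][0] for p in pairs]
    slrs = [pairwise[p][1] for p in pairs]
    n = len(pairs)
    return {
        "n_pairs": n,
        "frac_increase": sum(c == "I" for c in calls) / n,
        "frac_decrease": sum(c == "D" for c in calls) / n,
        "mean_slr": mean(slrs),
    }

def passes(summary, min_frac_increase=0.8, min_mean_slr=1.0):
    """Hypothetical filter: keep probe sets called 'Increase' in most pairs."""
    return (summary["frac_increase"] >= min_frac_increase
            and summary["mean_slr"] >= min_mean_slr)

# Made-up example: 2 control arrays x 2 treated arrays = 4 pairwise comparisons.
example = {
    ("c1", "t1"): ("I", 1.5),
    ("c1", "t2"): ("I", 1.0),
    ("c2", "t1"): ("MI", 0.5),
    ("c2", "t2"): ("I", 1.0),
}
summary = summarize_group_comparison(example, ["c1", "c2"], ["t1", "t2"])
```

Under this scheme, two groups of n and m arrays yield n × m pairwise comparisons, which is one way a repository of about 1,000 arrays could hold about 10,000 pairwise comparisons. Combining several such summaries with user-chosen cut-offs would correspond to applying multiple comparison parameters at once.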

Conclusion: SiPaGene is a repository and a high-quality tool for primary analysis of GeneChips. It exploits the MAS5.0/GCOS pairwise comparison algorithm and enables restricted access and user-specific sharing. It aims not for a complete representation of all public arrays but for high-quality analysis with stepwise integration of reference signatures for detailed meta-analyses. The development of additional tools, such as functional annotation networks based on expression information, will be a future step towards a systematic biological analysis of expression profiles.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology
  • Database Management Systems*
  • Databases, Genetic*
  • Gene Expression Profiling / methods*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Sequence Analysis, DNA
  • Software
  • User-Computer Interface