SemBiosphere: a semantic web approach to recommending microarray clustering services

Pac Symp Biocomput. 2006:188-99.

Abstract

Clustering is a popular method for analyzing microarray data. Given the large number of clustering algorithms available, it is difficult to identify the most suitable ones for a particular task. It is also difficult to locate, download, install and run the algorithms. This paper describes a matchmaking system, SemBiosphere, which solves both problems. It recommends clustering algorithms based on a minimal set of user requirements and the properties of the data. An ontology was developed in OWL, an expressive ontology language, to describe what the algorithms are and how they perform, in addition to how they can be invoked. This allows machines to "understand" the algorithms and make the recommendations. The algorithms can be implemented by different groups and in different languages, and run on different platforms at geographically distributed sites. Through the use of XML-based web services, they can all be invoked in the same standard way. The current clustering services were transformed from the non-semantic web services of the Biosphere system, which includes a variety of algorithms that have been applied to microarray gene expression data analysis. New algorithms can be incorporated into the system without much effort. The SemBiosphere system and the complete clustering ontology can be accessed at http://yeasthub2.gersteinlab.org/sembiosphere/.
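
The matchmaking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the SemBiosphere implementation, which relies on an OWL ontology and semantic web services. The class `ServiceDescription`, the function `recommend`, the three property fields, and the catalog entries with their values are all hypothetical, chosen only to show how service descriptions might be filtered against user requirements and data properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Hypothetical machine-readable description of one clustering service."""
    name: str
    needs_cluster_count: bool       # must the user supply the number of clusters?
    tolerates_missing_values: bool  # can it run on data with missing entries?
    max_genes: int                  # assumed practical upper bound on data size


def recommend(services, *, user_knows_cluster_count,
              data_has_missing_values, n_genes):
    """Return the names of services compatible with the request, sorted by name."""
    matches = []
    for service in services:
        # User requirement: skip services that need a cluster count the user lacks.
        if service.needs_cluster_count and not user_knows_cluster_count:
            continue
        # Data property: skip services that cannot handle missing values.
        if data_has_missing_values and not service.tolerates_missing_values:
            continue
        # Data property: skip services that cannot handle this many genes.
        if n_genes > service.max_genes:
            continue
        matches.append(service.name)
    return sorted(matches)


# Illustrative catalog; the property values are invented for this example.
CATALOG = [
    ServiceDescription("kmeans", True, False, 50000),
    ServiceDescription("hierarchical", False, True, 5000),
    ServiceDescription("som", True, True, 50000),
]

if __name__ == "__main__":
    print(recommend(CATALOG,
                    user_knows_cluster_count=False,
                    data_has_missing_values=True,
                    n_genes=2000))
```

In this sketch the descriptions are plain records and the matching is a hard filter. An ontology-based system could instead infer compatibility from class hierarchies and rank candidates by described performance.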

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computational Biology
  • Internet*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Programming Languages
  • Semantics
  • Software