Inferring activity changes of transcription factors by binding association with sorted expression profiles

BMC Bioinformatics. 2007 Nov 16;8:452. doi: 10.1186/1471-2105-8-452.

Abstract

Background: Identifying the transcription factors (TFs) associated with a biological process is fundamental to understanding its regulatory mechanisms. However, TF activity changes often cannot be observed directly from microarray data because of the relatively low expression levels of TFs, post-transcriptional modifications, and other complications. Several approaches have been proposed to infer TF activity changes from microarray data. Some models assume a linear relationship between gene expression and TF-gene binding strength. Others first determine the target genes of a TF by applying a significance cutoff to binding affinity scores, and then test for differential expression between the target genes and all other genes.

Results: We propose a novel method, referred to as BASE (binding association with sorted expression), to infer TF activity changes from microarray expression profiles with the help of binding affinity data. It searches for the maximum association between the binding affinity profile of a TF and the expression change profile along the direction of sorted expression changes. The method makes no hard selection of target genes; instead, the significance of each TF activity change is evaluated at the end by a permutation test of the binding association. To show the effectiveness of the method, we apply it to three typical examples that use different kinds of binding affinity data: ChIP-chip data, motif discovery data, and position weight matrix scanning data. The implications obtained from all three examples are consistent with established biological results. Moreover, the inferences suggest new and biologically meaningful hypotheses for further investigation.
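The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact statistic, so the running-sum form below (affinity-weighted cumulative profile minus an unweighted baseline), the function names `max_association` and `permutation_pvalue`, and the synthetic data are all hypothetical. Binding affinities are assumed to be non-negative with at least one positive value.

```python
import numpy as np

def max_association(expr_change, affinity):
    """Signed maximum deviation between an affinity-weighted cumulative
    profile and an unweighted baseline, walking genes from the most
    up-regulated to the most down-regulated (assumed form of the statistic)."""
    expr_change = np.asarray(expr_change, dtype=float)
    affinity = np.asarray(affinity, dtype=float)
    order = np.argsort(-expr_change)              # sort genes by decreasing expression change
    e = np.abs(expr_change[order])
    b = affinity[order]
    weighted = np.cumsum(e * b) / np.sum(e * b)   # affinity-weighted cumulative profile
    baseline = np.cumsum(e) / np.sum(e)           # baseline that ignores binding affinity
    deviation = weighted - baseline
    return deviation[np.argmax(np.abs(deviation))]

def permutation_pvalue(expr_change, affinity, n_perm=1000, seed=0):
    """Two-sided permutation p-value: shuffle affinities across genes
    and recompute the statistic to build a null distribution."""
    rng = np.random.default_rng(seed)
    observed = max_association(expr_change, affinity)
    null = np.array([
        max_association(expr_change, rng.permutation(affinity))
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimated p-value strictly positive.
    p = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)
    return observed, p

# Synthetic example: the 20 most up-regulated genes are the only bound genes.
expr_change = np.linspace(2.0, -2.0, 200)
affinity = np.zeros(200)
affinity[:20] = 1.0
score, p_value = permutation_pvalue(expr_change, affinity, n_perm=200)
```

In this sketch no target-gene cutoff is applied: every gene contributes in proportion to its affinity, and significance comes only from the permutation step. A positive `score` indicates that high-affinity genes concentrate among up-regulated genes, and a negative one that they concentrate among down-regulated genes.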

Conclusion: The proposed method infers transcriptional activity changes from profiles of expression and binding affinity. The same machinery can handle various kinds of binding affinity data. The method does not require a linearity assumption and has the desirable property of scale invariance with respect to TF-specific binding affinity. It is easy to implement and can be routinely applied for transcriptional inference in microarray studies.
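The scale-invariance property can be illustrated with a short derivation, assuming (the abstract does not state this) that the association statistic depends on binding affinities only through a normalized cumulative sum. The symbols are introduced here for illustration: $e_k$ is the expression change of the $k$-th gene in sorted order, $b_k \ge 0$ is its binding affinity for the TF, $n$ is the number of genes, and $c > 0$ is an arbitrary rescaling constant.

```latex
f(i) = \frac{\sum_{k=1}^{i} |e_k|\, b_k}{\sum_{k=1}^{n} |e_k|\, b_k},
\qquad
b_k \to c\, b_k
\;\Longrightarrow\;
f_c(i) = \frac{c \sum_{k=1}^{i} |e_k|\, b_k}{c \sum_{k=1}^{n} |e_k|\, b_k} = f(i).
```

Under this assumption, rescaling all affinities of one TF by a positive constant leaves the statistic unchanged, so affinities measured on different scales for different TFs would need no common normalization.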

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Base Sequence
  • Binding Sites
  • Gene Expression Profiling / methods*
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis / methods*
  • Protein Binding
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Structure-Activity Relationship
  • Transcription Factors / chemistry*
  • Transcription Factors / genetics*

Substances

  • Transcription Factors