Using genome-referenced expressed sequence tag assembly to analyze the origin and expression patterns of Gossypium hirsutum transcripts

J Integr Plant Biol. 2013 Jul;55(7):576-85. doi: 10.1111/jipb.12066.

Abstract

We assembled 297,239 Gossypium hirsutum (Gh, a tetraploid cotton, AADD) expressed sequence tag (EST) sequences available in the National Center for Biotechnology Information database, using the recently published G. raimondii (Gr, a diploid cotton, DD) genome as a reference, and obtained 49,125 UniGenes. The average UniGene length increased from 804 and 791 bp in two previous EST assemblies to 1,019 bp in the current analysis. The number of putative cotton UniGenes with lengths of 3 kb or more increased from 25 or 34 to 1,223. As a result, thousands of originally independent G. hirsutum ESTs were aligned to produce large contigs encoding transcripts with very long open reading frames, indicating that the G. raimondii genome sequence provided a remarkable advantage for assembling the tetraploid cotton transcriptome. Significantly different distribution patterns within several GO terms, including transcription factor activity, were observed between D- and A-derived assemblies. Transcriptome analysis showed that, in a tetraploid cotton cell, 29,547 UniGenes were possibly derived from the D subgenome, whereas another 19,578 may have come from the A subgenome. Finally, some of the in silico data were confirmed by reverse transcription polymerase chain reaction experiments, which showed the changes in transcript levels for several gene families known to play key roles in cotton fiber development. We believe that our work provides a useful platform for functional and evolutionary genomic studies in cotton.
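
The subgenome counts reported in the abstract can be cross-checked against the UniGene total with a few lines of arithmetic. The sketch below is only an illustration: it uses the figures stated in the abstract, and the function and variable names are hypothetical, not part of the authors' pipeline.

```python
# Figures copied from the abstract; all names here are illustrative assumptions.
TOTAL_UNIGENES = 49_125   # UniGenes obtained from the genome-referenced assembly
D_DERIVED = 29_547        # UniGenes possibly derived from the D subgenome
A_DERIVED = 19_578        # UniGenes possibly derived from the A subgenome


def subgenome_fractions(d_count: int, a_count: int) -> dict:
    """Return the combined count and the share of each subgenome."""
    total = d_count + a_count
    return {"total": total, "D": d_count / total, "A": a_count / total}


summary = subgenome_fractions(D_DERIVED, A_DERIVED)

# The two subgenome counts should account for every UniGene.
assert summary["total"] == TOTAL_UNIGENES

print(f"D-derived: {summary['D']:.1%}, A-derived: {summary['A']:.1%}")
# prints "D-derived: 60.1%, A-derived: 39.9%"
```

On these numbers, roughly three fifths of the assembled UniGenes are attributed to the D subgenome, the one matching the diploid reference genome.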

Keywords: Cotton fiber; Gossypium; deep sequencing; expressed sequence tag assembly; functional genomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cellulose / biosynthesis
  • Ethylenes / biosynthesis
  • Expressed Sequence Tags / metabolism*
  • Gene Expression Regulation, Plant*
  • Gene Ontology
  • Genes, Plant / genetics
  • Genome, Plant / genetics*
  • Gossypium / genetics*
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • Sequence Analysis, DNA
  • Statistics as Topic
  • Tetraploidy
  • Transcription Factors / metabolism
  • Transcriptome / genetics

Substances

  • Ethylenes
  • RNA, Messenger
  • Transcription Factors
  • Cellulose
  • ethylene