Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library

Nat Biotechnol. 2010 Jan;28(1):47-55. doi: 10.1038/nbt.1600. Epub 2009 Dec 27.

Abstract

Structural variants (SVs) are a major source of human genomic variation; however, characterizing them at nucleotide resolution remains challenging. Here we assemble a library of breakpoints at nucleotide resolution by collating and standardizing ~2,000 published SVs. For each breakpoint, we infer its ancestral state (through comparison to primate genomes) and its mechanism of formation (e.g., nonallelic homologous recombination, NAHR). We characterize breakpoint sequences with respect to genomic landmarks, chromosomal location, sequence motifs and physical properties, finding that the occurrence of insertions and deletions is more balanced than previously reported and that NAHR-formed breakpoints are associated with relatively rigid, stable DNA helices. Finally, we demonstrate an approach, BreakSeq, for scanning the reads from short-read sequenced genomes against our breakpoint library to accurately identify previously overlooked SVs, which we then validate by PCR. As new data become available, we expect our BreakSeq approach will become more sensitive and facilitate rapid SV genotyping of personal genomes.
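The idea of scanning reads against a breakpoint library can be illustrated with a toy sketch. The code below is not the BreakSeq implementation: it assumes a simple deletion, exact substring matching in place of a real short-read aligner, and hypothetical names (Junction, deletion_junctions, spanning_hits, flank, min_overlap) chosen only for illustration. It builds one junction sequence for the deleted allele and two for the reference allele, then reports which junctions a read crosses with at least min_overlap bases on each side of the breakpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    name: str        # hypothetical label, e.g. "del1:alt"
    seq: str         # two flanks joined across the breakpoint
    breakpoint: int  # offset in seq where the two flanks meet


def deletion_junctions(ref, start, end, flank, name):
    """Junctions for a deletion of ref[start:end] (0-based, end-exclusive).

    Assumption: the library entry is a plain deletion with known coordinates.
    """
    left = ref[max(0, start - flank):start]        # sequence just before the SV
    right = ref[end:end + flank]                   # sequence just after the SV
    inner_left = ref[start:start + flank]          # reference bases after the left breakpoint
    inner_right = ref[max(0, end - flank):end]     # reference bases before the right breakpoint
    return [
        Junction(f"{name}:alt", left + right, len(left)),
        Junction(f"{name}:ref_left", left + inner_left, len(left)),
        Junction(f"{name}:ref_right", inner_right + right, len(inner_right)),
    ]


def spanning_hits(read, junctions, min_overlap):
    """Names of junctions the read matches exactly while crossing the breakpoint."""
    hits = []
    for j in junctions:
        pos = j.seq.find(read)
        while pos != -1:
            # Require min_overlap bases of the read on each side of the breakpoint.
            if pos + min_overlap <= j.breakpoint <= pos + len(read) - min_overlap:
                hits.append(j.name)
                break
            pos = j.seq.find(read, pos + 1)
    return hits


# Made-up 30-base reference with a 10-base deletion at positions 10..19.
REF = "TTGACCGTAAGGGCCCATATCTAGTCAGGA"
LIBRARY = deletion_junctions(REF, start=10, end=20, flank=8, name="del1")
```

In this sketch, a read such as "CGTAACTAGT" crosses the joined flanks and supports the deleted allele, while "CGTAAGGGCC" supports the reference allele at the left breakpoint. A read that lies entirely inside one flank matches more than one junction sequence but does not cross a breakpoint, so it is treated as uninformative. A realistic pipeline would use an aligner that tolerates mismatches and would handle insertions and both strands.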

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Bias
  • Chromosome Breakpoints*
  • Chromosome Mapping
  • Gene Library*
  • Genetic Loci / genetics
  • Genetic Variation*
  • Humans
  • Nucleotides / genetics*
  • Phylogeny
  • Primates / genetics
  • Sequence Analysis, DNA / methods*

Substances

  • Nucleotides