Genome diversity of Epstein-Barr virus from multiple tumor types and normal infection

J Virol. 2015 May;89(10):5222-37. doi: 10.1128/JVI.03614-14. Epub 2015 Mar 18.

Abstract

Epstein-Barr virus (EBV) infects most of the world's population and is causally associated with several human cancers, but little is known about how EBV genetic variation might influence infection or EBV-associated disease. There are currently no published wild-type EBV genome sequences from a healthy individual and very few genomes from EBV-associated diseases. We have sequenced 71 geographically distinct EBV strains from cell lines, multiple types of primary tumor, and blood samples and the first EBV genome from the saliva of a healthy carrier. We show that the established genome map of EBV accurately represents all strains sequenced, but novel deletions are present in a few isolates. We have increased the number of type 2 EBV genomes sequenced from one to 12 and establish that the type 1/type 2 classification is a major feature of EBV genome variation, defined almost exclusively by variation of EBNA2 and EBNA3 genes, but geographic variation is also present. Single nucleotide polymorphism (SNP) density varies substantially across all known open reading frames and is highest in latency-associated genes. Some T-cell epitope sequences in EBNA3 genes show extensive variation across strains, and we identify codons under positive selection, both important considerations for the development of vaccines and T-cell therapy. We also provide new evidence for recombination between strains, which provides a further mechanism for the generation of diversity. Our results provide the first global view of EBV sequence variation and demonstrate an effective method for sequencing large numbers of genomes to further understand the genetics of EBV infection.

Importance: Epstein-Barr virus (EBV) infects most people in the world and causes several human diseases, which occur at very different rates in different parts of the world and are linked to host immune system variation. Natural variation in EBV DNA sequence may be important for normal infection and for causing disease. Here we used rapid, cost-effective sequencing to determine 71 new EBV sequences from different sample types and locations worldwide. We showed geographic variation in EBV genomes and identified the most variable parts of the genome. We identified protein sequences that seem to have been selected by the host immune system and detected variability in known immune epitopes. This gives the first overview of EBV genome variation, important for designing vaccines and immune therapy for EBV, and provides techniques to investigate relationships between viral sequence variation and EBV-associated diseases.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Antigens, Viral / genetics
  • Carrier State / virology
  • Cell Line, Tumor
  • DNA, Viral / genetics
  • Epitopes, T-Lymphocyte / genetics
  • Epstein-Barr Virus Infections / virology*
  • Epstein-Barr Virus Nuclear Antigens / genetics
  • Genetic Variation*
  • Genome, Viral*
  • Herpesvirus 4, Human / classification
  • Herpesvirus 4, Human / genetics*
  • Herpesvirus 4, Human / isolation & purification
  • Humans
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • Recombination, Genetic
  • Viral Matrix Proteins / genetics

Substances

  • Antigens, Viral
  • DNA, Viral
  • EBV-associated membrane antigen, Epstein-Barr virus
  • Epitopes, T-Lymphocyte
  • Epstein-Barr Virus Nuclear Antigens
  • Viral Matrix Proteins