New SNP Attributes

New attributes have been added to dbSNP to allow searching and filtering of human variation by the following characteristics.

Please contact snp-admin@ncbi.nlm.nih.gov if you have any questions or comments.

Attribute RS Count (Build 132)

Allele Origin:   The rs report summarizes the reported origin(s) of the variant allele, as asserted by each submitter for the submitted SNP (ss). Current values are germline, somatic, and unknown. Additional values will be added in a future release:

  • not-tested
  • tested-inconclusive
  • other

 

 

RS count (Build 132): 423

Clinical significance:   The significance of the indicated allele.

The supported values are:

  • unknown
  • untested
  • non-pathogenic
  • probable-non-pathogenic
  • probable-pathogenic
  • pathogenic
  • drug-response
  • histocompatibility
  • other

 

 

RS count (Build 132): 13105

Global minor allele frequency (MAF):  dbSNP reports the minor allele frequency for each rs in a default global population. Because this value is provided to distinguish common polymorphisms from rare variants, the MAF is the frequency of the second most frequent allele. For example, if there are 3 alleles with frequencies of 0.50, 0.49, and 0.01, the MAF is reported as 0.49. The current default global population is the 1000 Genomes phase 1 genotype data from 629 worldwide individuals, released in the 08-04-2010 dataset.

 

 

RS count (Build 132): 14946243
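The "second most frequent allele" rule for the global MAF can be illustrated with a short Python sketch. The function name `global_maf` and the input format (a dictionary mapping each allele to its frequency) are assumptions made for this illustration; they are not part of dbSNP. How dbSNP resolves ties between alleles is not stated here, so the sketch simply keeps the order in which tied alleles were given.

```python
def global_maf(allele_freqs):
    """Return (allele, frequency) for the second most frequent allele.

    allele_freqs: dict mapping allele -> frequency (hypothetical input format).
    Returns None when fewer than two alleles are present.
    """
    if len(allele_freqs) < 2:
        return None
    # Sort alleles from most to least frequent; sorted() is stable,
    # so tied alleles keep their original relative order.
    ranked = sorted(allele_freqs.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[1]


# The three-allele example from the text: the MAF is 0.49, not 0.01.
example = {"A": 0.50, "G": 0.49, "T": 0.01}
print(global_maf(example))
```

With the frequencies from the example, the allele reported is the one at 0.49 rather than the rarest allele at 0.01.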

Suspect:  Variation suspected to be a false positive, either because of artifacts caused by a paralogous sequence in the genome (Musumeci et al. 2010) (Sudmant et al. 2010) or because evidence suggests sequencing errors or computational artifacts.

 

RS count (Build 132): 105630

Below are examples of the attributes shown on web pages and schema changes.

RefSNP Summary (Example)

 
A. Allele Origin Indicated as Germline or Somatic for each allele
B. Clinical Significance Click on "VarView" or "OMIM" to view phenotype
C. Global MAF The minor allele (G), frequency (0.003), and allele count (3) are shown
D. Suspected A red "?" icon is shown for suspected SNP in the "Validation" row
 

SNP GeneView (Example)

A. Allele Origin Indicated as Germline or Somatic for each allele
B. Clinical Significance Click on icon under "Clinical Source"  to view effect in Variation Viewer
C. Global MAF MAF is shown for the corresponding allele
D. Suspected A red "?" icon is shown for suspected SNP under the "Validation" column
 

Variation Viewer (Example)

A. Allele Origin Indicated as Germline or Somatic for the variation under "Origin" column
B. Clinical Significance shown under "Clinical Interpretation" column
C. Global MAF frequency is shown under minor allele frequency (MAF) column
D. Suspected A red "?" icon is shown for suspected SNP under the "Suspect" column
 

RS Docsum (XML Schema)

SNP Attribute XML Element
A. Allele Origin Rs/AlleleOrigin
B. Clinical Significance Rs/Phenotype
C. Global MAF Rs/Frequency
D. Suspected Rs/Validation/@name='suspect'
 

Entrez Search and Eutils Retrieval

The SNP attributes can be searched from the web using the fields and terms below, or selected from the limits page. The search can also be performed programmatically using eUtils eSearch. Note: eUtils eFetch is currently being updated to support retrieval of variations with the new attributes. We'll let you know when the update is done.

SNP Attribute Search Field (type) Search Terms
A. Allele Origin ALLELE_ORIGIN (text) somatic
    germline
B. Clinical Significance CLINICAL_SIGNIFICANCE (text) non pathogenic
    other
    pathogenic
    probable non pathogenic
    probable pathogenic
    unknown
C. Global MAF GLOBAL_MAF (floating-point) exact 0.01[GLOBAL_MAF] or range 0.01:0.05[GLOBAL_MAF]
D. Suspected SUSPECTED (text) paralog
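As a rough sketch of the programmatic route, the Python snippet below builds an eSearch request URL from the search fields in the table. It only constructs the URL and makes no network call. The helper names (`field_term`, `maf_range_term`, `esearch_url`) are hypothetical. Writing text terms in the same `value[FIELD]` form shown for GLOBAL_MAF, and combining terms with `AND`, follows general Entrez query syntax and is an assumption here; the field names are those listed on this page and may differ in later dbSNP releases.

```python
from urllib.parse import urlencode

# Public E-utilities eSearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field_term(value, field):
    """Format a text term as value[FIELD], e.g. somatic[ALLELE_ORIGIN]."""
    return f"{value}[{field}]"


def maf_range_term(low, high):
    """Format a GLOBAL_MAF range term, e.g. 0.01:0.05[GLOBAL_MAF]."""
    return f"{low}:{high}[GLOBAL_MAF]"


def esearch_url(term, retmax=20):
    """Build (but do not send) an eSearch URL against the snp database."""
    params = {"db": "snp", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Assumed query: pathogenic variants with a global MAF between 0.01 and 0.05.
term = " AND ".join([
    field_term("pathogenic", "CLINICAL_SIGNIFICANCE"),
    maf_range_term(0.01, 0.05),
])
url = esearch_url(term)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; eSearch returns the matching identifiers, which is separate from the eFetch retrieval mentioned in the note above.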
 
VCF
The four new VCF tags for these attributes are described in the 000-README file on the FTP site (ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/v4.0/) and in the VCF header.
SNP Attribute VCF Tag
A. Allele Origin ##INFO=<ID=SAO,Number=1,Type=Integer,Description="SNP Allele Origin: 0 - unspecified, 1 - Germline, 2 - Somatic, 3 - Both">
B. Clinical Significance ##INFO=<ID=SCS,Number=1,Type=Integer,Description="SNP Clinical Significance, 0 - unknown, 1 - untested, 2 - non-pathogenic, 3 - probable-non-pathogenic, 4 - probable-pathogenic, 5 - pathogenic, 6 - drug-response, 7 - histocompatibility, 255 - other">
C. Global MAF ##INFO=<ID=GMAF,Number=1,Type=Float,Description="Global Minor Allele Frequency [0, 0.5]; global population is 1000GenomesProject phase 1 genotype data from 629 individuals, released in the 08-04-2010 dataset">
D. Suspected ##INFO=<ID=SSR,Number=1,Type=Integer,Description="SNP Suspect Reason Code, 0 - unspecified, 1 - Paralog, 2 - byEST, 3 - Para_EST, 4 - oldAlign, 5 - other">
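To show how the integer codes in these tags map back to attribute values, here is a minimal Python sketch that decodes the four tags from a VCF INFO string. The code tables are copied from the header descriptions above. The function names and the sample INFO string are hypothetical and do not come from a real dbSNP record.

```python
# Code tables taken from the ##INFO header descriptions.
SAO = {0: "unspecified", 1: "Germline", 2: "Somatic", 3: "Both"}
SCS = {
    0: "unknown", 1: "untested", 2: "non-pathogenic",
    3: "probable-non-pathogenic", 4: "probable-pathogenic",
    5: "pathogenic", 6: "drug-response", 7: "histocompatibility",
    255: "other",
}
SSR = {0: "unspecified", 1: "Paralog", 2: "byEST", 3: "Para_EST",
       4: "oldAlign", 5: "other"}


def parse_info(info):
    """Split a VCF INFO string into a dict; valueless flags map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def decode_snp_attributes(info):
    """Decode SAO, SCS, GMAF and SSR from an INFO string, if present."""
    fields = parse_info(info)
    decoded = {}
    if "SAO" in fields:
        decoded["allele_origin"] = SAO.get(int(fields["SAO"]), "unrecognized")
    if "SCS" in fields:
        decoded["clinical_significance"] = SCS.get(int(fields["SCS"]), "unrecognized")
    if "GMAF" in fields:
        decoded["global_maf"] = float(fields["GMAF"])
    if "SSR" in fields:
        decoded["suspect_reason"] = SSR.get(int(fields["SSR"]), "unrecognized")
    return decoded


# Hypothetical INFO column, made up for illustration only.
sample_info = "SAO=1;SCS=5;GMAF=0.003;SSR=0"
print(decode_snp_attributes(sample_info))
```

Codes not listed in the header are reported as "unrecognized" rather than raising an error, which is a design choice of this sketch so that files from newer builds with additional codes still parse.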