Variation Viewer: Frequently Asked Questions
- In the Variant Table, MAF is minor allele frequency. But for an rs with more than two alleles, which one is the minor allele?
- In the Variant Table, how is "1000G MAF" different from "GO-ESP MAF" and "ExAC MAF"?
- Why is the allele reported in "ExAC MAF" sometimes different from the allele in ExAC Browser?
- When I search by one rs#, why is the value under "Location" different from the exact location?
- When I search by one rs#, why is another rs# shown in the search result list?
- Why do I occasionally notice an rs# in the Variant Table, but not in the dbSNP data track in Sequence Viewer, or vice versa?
- Why do I occasionally notice an rs# in the Variant Table, but searching by the same rs# returns no result?
- Why does the same variation appear to be in both dbSNP and dbVar?
- I need data on GRCh37 coordinates; how do I find this information now that GRCh38 is the default?
- Where can I find more information about Variation Viewer and how to use it?
- What standard sequences are the variations annotated on?
In the Variant Table, MAF is minor allele frequency. But for an rs with more than two alleles, which one is the minor allele?
When there are more than two alleles at a variant location, the second most frequent allele is used to calculate MAF.
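As a rough illustration of the rule above, the following Python sketch picks the second most frequent allele from a set of allele frequencies. The function name `minor_allele` and the example frequencies are hypothetical and are not taken from Variation Viewer itself.

```python
def minor_allele(freqs):
    """Return (allele, frequency) for the second most frequent allele.

    `freqs` maps each allele to its frequency. This is an illustrative
    helper, not code from Variation Viewer.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two alleles to define a minor allele")
    # Rank alleles from most to least frequent.
    ranked = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    # The second entry is the allele reported as the minor allele.
    return ranked[1]


# Made-up frequencies for a site with three alleles.
allele, maf = minor_allele({"A": 0.70, "G": 0.25, "T": 0.05})
print(f"{allele} = {maf}")
```

With these made-up frequencies, G (the second most frequent allele) is reported rather than T (the rarest allele).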
In the Variant Table, how is "1000G MAF" different from "GO-ESP MAF" and "ExAC MAF"?
- 1000G MAF is based on data from 1000 Genomes Project
  - Datasource: ~84 million variants from the Phase 3 May 2013 call set
  - Population: 1000G includes 2504 individuals from many different locations around the globe; see here for the specific 1000 Genomes populations.
  - NCBI Resources: dbSNP, dbVar, and BioProject
- GO-ESP MAF is based on data from NHLBI Grand Opportunity Exome Sequencing Project (ESP)
  - Datasource: ~1 million variants from the ESP 6500 exomes data set
  - Population: GO-ESP includes some of the largest well-phenotyped populations in the United States, representing more than 200,000 individuals from participating studies.
  - NCBI Resources: dbSNP, BioProject, and dbGaP
- ExAC MAF is based on data from the Exome Aggregation Consortium (ExAC)
  - Datasource: ~9 million variants from ExAC Release 0.3
  - Population: ExAC includes exome data from 60,706 individuals of diverse ethnicities (Lek et al. 2015)
  - NCBI Resources: dbSNP and BioProject
Why is the allele reported in "ExAC MAF" sometimes different from the allele in ExAC Browser?
Each MAF column in the Variant Table reports the minor allele of the variant, and the minor allele frequency (MAF). The Broad's ExAC Browser reports the frequency of the alternate allele, which is usually but not always the minor allele.
For example, the ExAC MAF column for rs509504 in Variation Viewer reports "A = 0.0055". However, ExAC Browser shows rs509504 with an allele frequency of "0.9945", corresponding to the G allele. This is because the reference allele is "A", and the alternate allele is "G". Even though the reference allele "A" is the rare allele, ExAC Browser still reports the frequency of the "non-reference" (i.e. the alternate) allele.
Note that Variation Viewer flags cases where the reference allele is the minor allele by showing the allele in bold.
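The relationship between the two values can be sketched in Python for the simple case of a variant with exactly two alleles, where the reference frequency is one minus the alternate frequency. The function name `minor_from_alt` and the rounding to four decimal places are assumptions made for this illustration only.

```python
def minor_from_alt(ref, alt, alt_freq):
    """Derive the minor allele from an alternate-allele frequency.

    Assumes a biallelic variant, so ref_freq = 1 - alt_freq.
    Returns (minor_allele, frequency, ref_is_minor).
    Illustrative only; not code from Variation Viewer or ExAC Browser.
    """
    ref_freq = 1.0 - alt_freq
    if ref_freq < alt_freq:
        # The reference allele is the rarer one (the case flagged in bold).
        return ref, round(ref_freq, 4), True
    return alt, round(alt_freq, 4), False


# rs509504: reference allele A, alternate allele G at frequency 0.9945.
allele, maf, ref_is_minor = minor_from_alt("A", "G", 0.9945)
print(f"{allele} = {maf}", "(reference allele is minor)" if ref_is_minor else "")
```

For rs509504 this recovers the "A = 0.0055" value shown in the ExAC MAF column from the 0.9945 alternate allele frequency.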
When I search by one rs#, why is the value under "Location" different from the exact location?
The value listed under "Location" in the Search area is rounded to four significant digits. The actual position of the rs# is in the hover text. For example, for an rs at position 1298915, the value under "Location" says 1.299M - 1.299M.
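The rounding in this example can be reproduced with a short Python sketch. It only covers positions displayed in units of M (millions of bases); the function name `format_location`, the dropping of trailing zeros, and the behavior for positions below one million are assumptions, not documented Variation Viewer behavior.

```python
def format_location(position):
    """Format a base position in millions, rounded to four significant digits.

    Illustrative helper: handles only the "M" unit, and the ".4g" format
    drops trailing zeros, which may differ from the actual display.
    """
    millions = position / 1_000_000
    return f"{millions:.4g}M"


start = stop = 1298915
print(f"{format_location(start)} - {format_location(stop)}")
```

For a single-base rs, the start and stop positions are the same, so both ends of the range round to the same displayed value.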
When I search by one rs#, why is another rs# shown in the search result list?
This happens when the rs# you searched for has been merged into another rs#. For example, searching by rs967798 returns rs184474, because rs967798 has been merged into rs184474. The search result list always shows the current rs#.
Why do I occasionally notice an rs# in the Variant Table, but not in the dbSNP data track in Sequence Viewer, or vice versa?
The Variant Table and dbSNP data track in Sequence Viewer (SViewer) should be in sync, but occasionally an update in one component is delayed. When this happens, you may notice that the same rs# does not show up in both the Variant Table and SViewer's dbSNP track. If this happens, please report it to us at help@ncbi.nlm.nih.gov. For any rs# in question, please check the dbSNP RefSNP page. For example, for rs328, use URL: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs328
Why do I occasionally notice a dbSNP rs# in the Variant Table, but searching by the same rs# returns no result?
Occasionally, an rs# is withdrawn after curator validation. In this case, searching by the withdrawn rs# using the Variation Viewer search bar returns no result, but there can be a delay in removing the same rs# entry from the Variant Table. The Variation Viewer and dbSNP teams are working to shorten the time lag in data updates so this discrepancy does not happen again. Should you notice an issue, please email us at help@ncbi.nlm.nih.gov. For any rs# in question, please check the dbSNP RefSNP page. If the rs# has been withdrawn, the RefSNP page will report the withdrawal. For example, for rs757045552, use URL: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs757045552
Why does the same variation appear to be in both dbSNP and dbVar?
A small set of identical variants appears in both dbSNP and dbVar because the submitter submitted the same variant to both databases.
Examples:
- http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=146978295
- http://www.ncbi.nlm.nih.gov/dbvar/variants/nsv393922/#tab-2
I need data on GRCh37 coordinates; how do I find this information now that GRCh38 is the default?
To see data in GRCh37 coordinates, select "GRCh37.p13" under "Pick Assembly" at the upper left of the page.
Where can I find more information about Variation Viewer and how to use it?
The Help page describes Variation Viewer components and how to use them. It also has a link to an introductory video on YouTube.
What standard sequences are the variations annotated on?
Variations are mostly annotated on RefSeq sequences, but a small set of variants may be annotated on GenBank cDNA sequences.