NCBI Lacerta agilis Annotation Release 100

The RefSeq genome records for Lacerta agilis were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Lacerta agilis Annotation Release 100.

Annotation release ID: 100
Date of Entrez queries for transcripts and proteins: Mar 24 2020
Date of submission of annotation to the public databases: Mar 28 2020
Software version: 8.4

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
rLacAgi1.pri	GCF_009819535.1	Vertebrate Genomes Project	12-31-2019	Reference	21 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and length of annotated features are provided below for each assembly.

Feature counts

Feature	rLacAgi1.pri
Genes and pseudogenes	23,190
protein-coding	20,325
non-coding	2,335
transcribed pseudogenes	0
non-transcribed pseudogenes	491
genes with variants	7,258
immunoglobulin/T-cell receptor gene segments	39
other	0
mRNAs	39,254
fully-supported	36,295
with > 5% ab initio	1,427
partial	166
with filled gap(s)	1
known RefSeq (NM_)	0
model RefSeq (XM_)	39,254
non-coding RNAs	3,050
fully-supported	1,998
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	2,447
pseudo transcripts	0
fully-supported	0
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	0
CDSs	39,306
fully-supported	36,295
with > 5% ab initio	1,588
partial	166
with major correction(s)	1,267
known RefSeq (NP_)	0
model RefSeq (XP_)	39,267
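
As a quick consistency check on the table above, the gene-level rows can be summed and compared with the "Genes and pseudogenes" total. The sketch below assumes the listed categories are mutually exclusive and that "genes with variants" is an overlapping subset rather than a separate category; the report does not state this, but the arithmetic agrees with that reading. The variable names are illustrative only.

```python
# Gene-level counts copied from the "Feature counts" table for rLacAgi1.pri.
# Assumption: these categories do not overlap; "genes with variants" (7,258)
# is left out because it appears to be a subset of the other rows.
gene_categories = {
    "protein-coding": 20_325,
    "non-coding": 2_335,
    "transcribed pseudogenes": 0,
    "non-transcribed pseudogenes": 491,
    "immunoglobulin/T-cell receptor gene segments": 39,
    "other": 0,
}
reported_total = 23_190  # "Genes and pseudogenes" row

computed_total = sum(gene_categories.values())
print(computed_total, computed_total == reported_total)
```

The same approach gives 22,660 for protein-coding plus non-coding genes, which matches the "Genes" count in the feature lengths table, where pseudogenes are excluded.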

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	22,660	32,207	14,105	53	1,400,662
All transcripts	42,304	2,751	2,183	53	95,557
mRNA	39,254	2,909	2,301	168	95,557
misc_RNA	513	2,509	2,202	155	13,563
tRNA	601	73	73	64	84
lncRNA	1,485	541	424	77	5,280
snoRNA	229	113	98	53	318
snRNA	176	131	114	60	196
guide_RNA	19	177	137	86	382
rRNA	27	202	119	119	1,520
Single-exon transcripts	1,634	1,205	960	171	8,150
coding transcripts (NM_/XM_ )	1,634	1,205	960	171	8,150
CDSs	39,267	2,122	1,530	96	94,302
Exons	229,345	235	134	1	20,202
in coding transcripts (NM_/XM_ )	224,695	236	135	1	20,202
in non-coding transcripts (NR_/XR_ )	9,060	184	122	2	5,452
Introns	206,482	4,090	1,375	30	962,069
in coding transcripts (NM_/XM_ )	203,393	4,056	1,371	30	962,069
in non-coding transcripts (NR_/XR_ )	7,348	5,256	1,532	30	428,357

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.89	1	1	50
Number of exons per transcript	12.82	10	1	257

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 20,312 coding genes, 19,611 genes had a protein with an alignment covering 50% or more of the query and 12,280 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
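
The coverage definitions above can be sketched in a few lines of Python. The alignment lengths in the first part are made-up numbers chosen only to illustrate the formula, and `coverage_percent` is a hypothetical helper, not part of any NCBI tool. The second part uses the gene counts reported in this section to express the two thresholds as shares of all coding genes.

```python
def coverage_percent(aligned_length: int, sequence_length: int) -> float:
    """Percentage of a sequence's length that is included in an alignment."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * aligned_length / sequence_length

# Hypothetical alignment: a 400-residue annotated protein (query) aligned to
# a 500-residue Swiss-Prot protein (target).
query_cov = coverage_percent(aligned_length=380, sequence_length=400)
target_cov = coverage_percent(aligned_length=390, sequence_length=500)
print(f"query coverage:  {query_cov:.1f}%")
print(f"target coverage: {target_cov:.1f}%")

# Counts taken from the report text.
coding_genes = 20_312
genes_ge_50 = 19_611  # alignment covers >= 50% of the query
genes_ge_95 = 12_280  # alignment covers >= 95% of the query
share_ge_50 = 100.0 * genes_ge_50 / coding_genes
share_ge_95 = 100.0 * genes_ge_95 / coding_genes
print(f"genes with >= 50% query coverage: {share_ge_50:.2f}%")
print(f"genes with >= 95% query coverage: {share_ge_95:.2f}%")
```

In this example the same alignment gives a high query coverage but a lower target coverage, which is the pattern expected when an annotated protein is shorter than its best curated match.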

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
rLacAgi1.pri	GCF_009819535.1	5.31%	34.05%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	33	33 (100.00%)	33 (100.00%)	99.52%	96.96%

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	8,010,791,362	50%	31%	262,930
SAMD00043884	NA	liver (Takydromus tachydromoides, female, SAMD00043884)	53,655,734	37%	37%	123,101
SAMEA104469066	NA	Heart (Lacerta viridis, female, SAMEA104469066)	22,431,718	39%	28%	101,044
SAMEA104469067	NA	Brain (Lacerta viridis, female, SAMEA104469067)	22,127,836	31%	37%	101,362
SAMEA104469068	NA	Ovary (Lacerta viridis, female, SAMEA104469068)	34,425,986	51%	30%	149,146
SAMEA104469069	NA	Liver (Lacerta viridis, female, SAMEA104469069)	50,989,276	48%	30%	108,239
SAMEA104469164	NA	Kidney (Lacerta viridis, female, SAMEA104469164)	22,635,102	42%	22%	108,086
SAMEA104469200	NA	Heart (Lacerta bilineata, female, SAMEA104469200)	21,144,564	40%	28%	116,091
SAMEA104469201	NA	Brain (Lacerta bilineata, female, SAMEA104469201)	26,086,626	45%	23%	141,841
SAMEA104469202	NA	Ovary (Lacerta bilineata, female, SAMEA104469202)	21,095,476	51%	30%	142,192
SAMEA104469203	NA	Kidney (Lacerta bilineata, female, SAMEA104469203)	31,775,946	43%	35%	123,915
SAMEA104469204	NA	Liver (Lacerta bilineata, female, SAMEA104469204)	29,020,828	48%	35%	107,295
SAMN04452394	26980341	liver (Takydromus sexlineatus, female, SAMN04452394)	370,591,310	32%	10%	108,094
SAMN04958257	28953924	Brain-228 (Podarcis siculus, Adult, male, SAMN04958257)	32,997,952	46%	22%	141,204
SAMN04958258	28953924	Testis-228 (Podarcis siculus, Adult, male, SAMN04958258)	43,430,340	54%	30%	145,720
SAMN04958259	28953924	Brain-227 (Podarcis siculus, Adult, male, SAMN04958259)	37,235,310	43%	21%	140,205
SAMN04958260	28953924	Testis-227 (Podarcis siculus, Adult, male, SAMN04958260)	42,161,434	54%	30%	149,335
SAMN04958261	28953924	Brain-224 (Podarcis siculus, Adult, male, SAMN04958261)	37,769,874	47%	22%	143,886
SAMN04958262	28953924	Testis-224 (Podarcis siculus, Adult, male, SAMN04958262)	32,666,588	48%	29%	140,574
SAMN04958263	28953924	Brain-256 (Podarcis siculus, Adult, male, SAMN04958263)	53,658,720	47%	24%	154,726
SAMN04958264	28953924	Testis-256 (Podarcis siculus, Adult, male, SAMN04958264)	31,541,380	53%	30%	142,804
SAMN04958265	28953924	Brain-273 (Podarcis siculus, Adult, male, SAMN04958265)	37,701,762	47%	23%	143,159
SAMN04958266	28953924	Testis-273 (Podarcis siculus, Adult, male, SAMN04958266)	41,029,640	56%	30%	143,395
SAMN04958267	28953924	Testis-278 (Podarcis siculus, Adult, male, SAMN04958267)	43,749,114	50%	31%	150,064
SAMN04958268	28953924	Brain-278 (Podarcis siculus, Adult, male, SAMN04958268)	38,155,290	48%	24%	144,187
SAMN07357074	NA	stage 27, embryonic tissue (Podarcis muralis, SAMN07357074)	3,815,219,130	58%	32%	224,750
SAMN09111777	NA	Brain (Podarcis siculus, not determined, SAMN09111777)	31,854,430	23%	15%	54,548
SAMN09111778	NA	Brain (Podarcis siculus, not determined, SAMN09111778)	25,658,956	41%	23%	109,523
SAMN09111779	NA	Brain (Podarcis siculus, not determined, SAMN09111779)	6,379,952	6%	14%	12,078
SAMN10786382	30819892	Brain (Podarcis muralis, Adult, male, SAMN10786382)	221,584,952	52%	39%	182,453
SAMN10786383	30819892	Duodenum (Podarcis muralis, Adult, male, SAMN10786383)	123,490,538	43%	36%	147,584
SAMN10786384	30819892	Embryo (incubated at 15C) (Podarcis muralis, SAMN10786384)	73,459,248	55%	27%	167,037
SAMN10786385	30819892	Embryo (incubated at 24C) (Podarcis muralis, SAMN10786385)	66,936,500	57%	28%	164,331
SAMN10786386	30819892	Muscle (Podarcis muralis, Adult, male, SAMN10786386)	132,361,306	63%	51%	146,221
SAMN10786387	30819892	Skin (Podarcis muralis, Adult, male, SAMN10786387)	130,934,858	52%	46%	134,177
SAMN10786388	30819892	Testis (Podarcis muralis, Adult, male, SAMN10786388)	206,297,322	48%	42%	188,181
SAMN11775984	NA	mix of various organs (Iberolacerta bonnali, not collected, not collected, SAMN11775984)	95,216,156	56%	24%	146,330
SAMN11775986	NA	mix of various organs (Holaspis guentheri, not collected, not collected, SAMN11775986)	73,563,916	35%	29%	102,936
SAMN11775987	NA	mix of various organs (Darevskia parvula, not collected, not collected, SAMN11775987)	124,815,964	56%	24%	154,262
SAMN11775988	NA	tail fin (Takydromus sexlineatus, not collected, not collected, SAMN11775988)	50,455,702	49%	29%	105,711
SAMN11775989	NA	mix of various organs (Timon pater, not collected, not collected, SAMN11775989)	50,726,868	45%	16%	106,587
SAMN11775990	NA	mix of various organs (Lacerta agilis, not collected, not collected, SAMN11775990)	45,944,760	81%	22%	152,614
SAMN11775991	NA	mix of various organs (Podarcis muralis, not collected, not collected, SAMN11775991)	46,782,124	35%	23%	124,558
SAMN11775992	NA	mix of various organs (Podarcis liolepis, not collected, not collected, SAMN11775992)	76,766,548	50%	25%	163,754
SAMN11775994	NA	mix of various organs (Zootoca vivipara, not collected, not collected, SAMN11775994)	51,868,584	46%	24%	148,123
SAMN11775995	NA	mix of various organs (Hellenolacerta graeca, not collected, not collected, SAMN11775995)	41,576,428	31%	18%	42,203
SAMN11775997	NA	mix of various organs (Scelarcis perspicillata, not collected, not collected, SAMN11775997)	96,652,332	35%	23%	121,840
SAMN11775998	NA	mix of various organs (Dinarolacerta mosorensis, not collected, not collected, SAMN11775998)	81,609,940	42%	22%	146,989
SAMN11775999	NA	mix of various organs (Dalmatolacerta oxycephala, not collected, not collected, SAMN11775999)	72,818,992	46%	23%	149,986
SAMN11776000	NA	mix of various organs (Algyroides nigropunctatus, not collected, not collected, SAMN11776000)	83,106,780	39%	23%	142,250
SAMN11776001	NA	mix of various organs (Archaeolacerta bedriagae, not collected, not collected, SAMN11776001)	81,032,822	51%	25%	144,914
SAMN11776002	NA	mix of various organs (Phoenicolacerta laevis, not collected, not collected, SAMN11776002)	89,195,856	37%	20%	127,974
SAMN11776003	NA	mix of various organs (Apathya cappadocica, not collected, not collected, SAMN11776003)	68,864,996	36%	24%	123,747
SAMN11776004	NA	mix of various organs (Iranolacerta brandtii, not collected, not collected, SAMN11776004)	52,163,696	49%	21%	140,553
SAMN14329711	NA	whole (Zootoca vivipara, female, SAMN14329711)	715,373,900	25%	39%	187,187

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
DRR072216	DRX066166	DRP003799	SAMD00043884	53,655,734	37%	37%
ERR2245308	ERX2297331	ERP105987	SAMEA104469066	7,158,284	37%	27%
ERR2245309	ERX2297332	ERP105987	SAMEA104469066	6,290,530	33%	27%
ERR2245310	ERX2297333	ERP105987	SAMEA104469066	8,982,904	44%	28%
ERR2245315	ERX2297338	ERP105987	SAMEA104469067	6,971,246	31%	37%
ERR2245316	ERX2297339	ERP105987	SAMEA104469067	6,054,582	27%	37%
ERR2245317	ERX2297340	ERP105987	SAMEA104469067	9,102,008	34%	38%
ERR2245318	ERX2297341	ERP105987	SAMEA104469068	10,823,590	51%	30%
ERR2245319	ERX2297342	ERP105987	SAMEA104469068	9,319,974	46%	29%
ERR2245320	ERX2297343	ERP105987	SAMEA104469068	14,282,422	55%	30%
ERR2245323	ERX2297346	ERP105987	SAMEA104469069	16,546,834	47%	30%
ERR2245324	ERX2297347	ERP105987	SAMEA104469069	14,176,364	43%	30%
ERR2245325	ERX2297348	ERP105987	SAMEA104469069	20,266,078	53%	30%
ERR2245328	ERX2297351	ERP105987	SAMEA104469164	7,285,922	41%	22%
ERR2245329	ERX2297352	ERP105987	SAMEA104469164	6,266,306	37%	21%
ERR2245330	ERX2297353	ERP105987	SAMEA104469164	9,082,874	47%	22%
ERR2245428	ERX2297451	ERP105987	SAMEA104469200	6,517,154	40%	27%
ERR2245429	ERX2297452	ERP105987	SAMEA104469200	5,646,484	35%	27%
ERR2245430	ERX2297453	ERP105987	SAMEA104469200	8,980,926	43%	28%
ERR2245431	ERX2297454	ERP105987	SAMEA104469201	8,180,908	44%	23%
ERR2245432	ERX2297455	ERP105987	SAMEA104469201	7,263,200	39%	23%
ERR2245433	ERX2297456	ERP105987	SAMEA104469201	10,642,518	49%	24%
ERR2245434	ERX2297457	ERP105987	SAMEA104469202	6,531,330	51%	30%
ERR2245435	ERX2297458	ERP105987	SAMEA104469202	5,682,690	46%	30%
ERR2245436	ERX2297459	ERP105987	SAMEA104469202	8,881,456	55%	30%
ERR2245437	ERX2297460	ERP105987	SAMEA104469203	10,238,296	42%	34%
ERR2245438	ERX2297461	ERP105987	SAMEA104469203	8,559,482	39%	34%
ERR2245439	ERX2297462	ERP105987	SAMEA104469203	12,978,168	48%	35%
ERR2245440	ERX2297463	ERP105987	SAMEA104469204	9,146,668	48%	35%
ERR2245441	ERX2297464	ERP105987	SAMEA104469204	7,989,542	43%	35%
ERR2245442	ERX2297465	ERP105987	SAMEA104469204	11,884,618	53%	36%
SRR3139467	SRX1557463	SRP069196	SAMN04452394	370,591,310	32%	10%
SRR3479613	SRX1745110	SRP074471	SAMN04958257	32,997,952	46%	22%
SRR3479614	SRX1745111	SRP074471	SAMN04958258	43,430,340	54%	30%
SRR3479617	SRX1745114	SRP074471	SAMN04958259	37,235,310	43%	21%
SRR3479618	SRX1745115	SRP074471	SAMN04958260	42,161,434	54%	30%
SRR3479619	SRX1745116	SRP074471	SAMN04958261	37,769,874	47%	22%
SRR3479620	SRX1745117	SRP074471	SAMN04958262	32,666,588	48%	29%
SRR3479621	SRX1745118	SRP074471	SAMN04958263	53,658,720	47%	24%
SRR3479622	SRX1745119	SRP074471	SAMN04958264	31,541,380	53%	30%
SRR3479623	SRX1745120	SRP074471	SAMN04958265	37,701,762	47%	23%
SRR3479624	SRX1745121	SRP074471	SAMN04958266	41,029,640	56%	30%
SRR3479615	SRX1745112	SRP074471	SAMN04958267	43,749,114	50%	31%
SRR3479616	SRX1745113	SRP074471	SAMN04958268	38,155,290	48%	24%
SRR5859200	SRX3027915	SRP113322	SAMN07357074	73,157,278	58%	31%
SRR5859199	SRX3027916	SRP113322	SAMN07357074	55,444,660	59%	32%
SRR5859198	SRX3027917	SRP113322	SAMN07357074	90,046,060	58%	31%
SRR5859197	SRX3027918	SRP113322	SAMN07357074	96,179,472	58%	32%
SRR5859196	SRX3027919	SRP113322	SAMN07357074	92,764,264	57%	31%
SRR5859195	SRX3027920	SRP113322	SAMN07357074	74,583,024	58%	31%
SRR5859194	SRX3027921	SRP113322	SAMN07357074	84,400,858	57%	31%
SRR5859193	SRX3027922	SRP113322	SAMN07357074	79,370,330	57%	31%
SRR5859192	SRX3027923	SRP113322	SAMN07357074	67,083,300	57%	31%
SRR5859191	SRX3027924	SRP113322	SAMN07357074	68,740,840	59%	32%
SRR5859190	SRX3027925	SRP113322	SAMN07357074	81,706,202	56%	31%
SRR5859189	SRX3027926	SRP113322	SAMN07357074	67,923,468	58%	32%
SRR5859188	SRX3027927	SRP113322	SAMN07357074	82,503,882	55%	30%
SRR5859187	SRX3027928	SRP113322	SAMN07357074	58,375,766	58%	32%
SRR5859186	SRX3027929	SRP113322	SAMN07357074	59,359,256	58%	31%
SRR5859185	SRX3027930	SRP113322	SAMN07357074	71,510,008	59%	32%
SRR5859184	SRX3027931	SRP113322	SAMN07357074	72,000,668	54%	30%
SRR5859183	SRX3027932	SRP113322	SAMN07357074	75,083,470	57%	31%
SRR5859182	SRX3027933	SRP113322	SAMN07357074	89,473,174	56%	31%
SRR5859181	SRX3027934	SRP113322	SAMN07357074	95,076,398	58%	32%
SRR5859180	SRX3027935	SRP113322	SAMN07357074	78,428,506	59%	32%
SRR5859179	SRX3027936	SRP113322	SAMN07357074	75,940,192	56%	31%
SRR5859178	SRX3027937	SRP113322	SAMN07357074	68,846,346	59%	32%
SRR5859177	SRX3027938	SRP113322	SAMN07357074	83,867,066	57%	31%
SRR5859176	SRX3027939	SRP113322	SAMN07357074	81,841,870	58%	31%
SRR5859175	SRX3027940	SRP113322	SAMN07357074	79,742,126	57%	31%
SRR5859174	SRX3027941	SRP113322	SAMN07357074	84,880,736	59%	32%
SRR5859173	SRX3027942	SRP113322	SAMN07357074	85,900,298	57%	31%
SRR5859172	SRX3027943	SRP113322	SAMN07357074	72,698,116	59%	33%
SRR5859171	SRX3027944	SRP113322	SAMN07357074	78,207,734	57%	32%
SRR5859170	SRX3027945	SRP113322	SAMN07357074	68,807,746	58%	32%
SRR5859169	SRX3027946	SRP113322	SAMN07357074	55,711,142	59%	32%
SRR5859168	SRX3027947	SRP113322	SAMN07357074	98,093,700	59%	32%
SRR5859167	SRX3027948	SRP113322	SAMN07357074	97,897,732	56%	31%
SRR5859166	SRX3027949	SRP113322	SAMN07357074	71,078,052	58%	32%
SRR5859165	SRX3027950	SRP113322	SAMN07357074	80,554,972	57%	32%
SRR5859164	SRX3027951	SRP113322	SAMN07357074	91,027,766	58%	32%
SRR5859163	SRX3027952	SRP113322	SAMN07357074	86,983,018	57%	31%
SRR5859162	SRX3027953	SRP113322	SAMN07357074	79,499,744	58%	32%
SRR5859161	SRX3027954	SRP113322	SAMN07357074	92,977,414	59%	32%
SRR5859160	SRX3027955	SRP113322	SAMN07357074	83,519,916	57%	31%
SRR5859159	SRX3027956	SRP113322	SAMN07357074	85,845,364	59%	32%
SRR5859158	SRX3027957	SRP113322	SAMN07357074	92,276,678	57%	31%
SRR5859157	SRX3027958	SRP113322	SAMN07357074	67,463,734	59%	32%
SRR5859156	SRX3027959	SRP113322	SAMN07357074	81,547,650	57%	31%
SRR5859155	SRX3027960	SRP113322	SAMN07357074	84,720,982	58%	32%
SRR5859154	SRX3027961	SRP113322	SAMN07357074	89,139,232	56%	31%
SRR5859153	SRX3027962	SRP113322	SAMN07357074	82,938,920	59%	32%
SRR7152529	SRX4071992	SRP145442	SAMN09111777	31,854,430	23%	15%
SRR7152530	SRX4071991	SRP145442	SAMN09111778	25,658,956	41%	23%
SRR7152531	SRX4071990	SRP145442	SAMN09111779	6,379,952	6%	14%
SRR8468525	SRX5274934	SRP181081	SAMN10786382	221,584,952	52%	39%
SRR8468526	SRX5274933	SRP181081	SAMN10786383	123,490,538	43%	36%
SRR8468527	SRX5274932	SRP181081	SAMN10786384	73,459,248	55%	27%
SRR8468528	SRX5274931	SRP181081	SAMN10786385	66,936,500	57%	28%
SRR8468521	SRX5274938	SRP181081	SAMN10786386	132,361,306	63%	51%
SRR8468522	SRX5274937	SRP181081	SAMN10786387	130,934,858	52%	46%
SRR8468523	SRX5274936	SRP181081	SAMN10786388	206,297,322	48%	42%
SRR9090245	SRX5865404	SRP198890	SAMN11775984	95,216,156	56%	24%
SRR9090243	SRX5865406	SRP198890	SAMN11775986	73,563,916	35%	29%
SRR9090242	SRX5865407	SRP198890	SAMN11775987	124,815,964	56%	24%
SRR9090241	SRX5865408	SRP198890	SAMN11775988	50,455,702	49%	29%
SRR9090240	SRX5865409	SRP198890	SAMN11775989	50,726,868	45%	16%
SRR9090239	SRX5865410	SRP198890	SAMN11775990	45,944,760	81%	22%
SRR9090248	SRX5865401	SRP198890	SAMN11775991	46,782,124	35%	23%
SRR9090247	SRX5865402	SRP198890	SAMN11775992	76,766,548	50%	25%
SRR9090256	SRX5865393	SRP198890	SAMN11775994	51,868,584	46%	24%
SRR9090255	SRX5865394	SRP198890	SAMN11775995	41,576,428	31%	18%
SRR9090257	SRX5865392	SRP198890	SAMN11775997	96,652,332	35%	23%
SRR9090252	SRX5865397	SRP198890	SAMN11775998	81,609,940	42%	22%
SRR9090251	SRX5865398	SRP198890	SAMN11775999	72,818,992	46%	23%
SRR9090254	SRX5865395	SRP198890	SAMN11776000	83,106,780	39%	23%
SRR9090253	SRX5865396	SRP198890	SAMN11776001	81,032,822	51%	25%
SRR9090250	SRX5865399	SRP198890	SAMN11776002	89,195,856	37%	20%
SRR9090249	SRX5865400	SRP198890	SAMN11776003	68,864,996	36%	24%
SRR9090238	SRX5865411	SRP198890	SAMN11776004	52,163,696	49%	21%
SRR11262332	SRX7868979	SRP251946	SAMN14329711	206,643,160	22%	36%
SRR11262331	SRX7868980	SRP251946	SAMN14329711	248,854,838	30%	39%
SRR11262330	SRX7868981	SRP251946	SAMN14329711	259,875,902	23%	39%
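
The by-run and by-sample tables appear to be related by a simple roll-up: the read count for a sample is the sum of the read counts of its runs. The sketch below checks this for two samples using rows copied from the tables above. It is only an illustration of that relationship, not a description of how the pipeline computes its statistics.

```python
from collections import defaultdict

# (run, sample, number of reads), copied from the by-run table.
runs = [
    ("ERR2245308", "SAMEA104469066", 7_158_284),
    ("ERR2245309", "SAMEA104469066", 6_290_530),
    ("ERR2245310", "SAMEA104469066", 8_982_904),
    ("SRR11262332", "SAMN14329711", 206_643_160),
    ("SRR11262331", "SAMN14329711", 248_854_838),
    ("SRR11262330", "SAMN14329711", 259_875_902),
]

# Read counts reported for the same samples in the by-sample table.
reported_by_sample = {
    "SAMEA104469066": 22_431_718,
    "SAMN14329711": 715_373_900,
}

# Sum the reads of every run belonging to each sample.
reads_by_sample = defaultdict(int)
for _run, sample, reads in runs:
    reads_by_sample[sample] += reads

for sample, total in reads_by_sample.items():
    print(sample, total, total == reported_by_sample[sample])
```

The percentage columns are rounded to whole percents in both tables, so they cannot be recombined exactly from the run-level rows.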

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Pogona vitticeps high-quality model RefSeq (XP_)	13,733	13,141 (95.69%)	13,141 (95.69%)	72.46%	82.47%
Protobothrops mucrosquamatus high-quality model RefSeq (XP_)	6,763	6,363 (94.09%)	6,363 (94.09%)	72.04%	85.93%
Python bivittatus high-quality model RefSeq (XP_)	12,616	12,015 (95.24%)	12,015 (95.24%)	72.42%	83.46%
Anolis carolinensis high-quality model RefSeq (XP_)	13,146	12,220 (92.96%)	12,220 (92.96%)	70.17%	81.77%
Same-species GenBank	33	32 (96.97%)	32 (96.97%)	73.19%	88.86%
Xenopus GenBank	31,816	8,809 (27.69%)	8,809 (27.69%)	67.85%	75.64%
Xenopus known RefSeq (NP_)	19,656	18,654 (94.90%)	18,654 (94.90%)	68.99%	79.28%
Sauropsida GenBank	29,326	16,910 (57.66%)	16,910 (57.66%)	67.67%	75.61%
Sauropsida known RefSeq (NP_)	8,124	7,556 (93.01%)	7,556 (93.01%)	71.41%	81.55%
Homo sapiens GenBank	144,512	73,908 (51.14%)	73,908 (51.14%)	61.07%	73.23%
Homo sapiens known RefSeq (NP_)	56,808	39,685 (69.86%)	39,685 (69.86%)	69.25%	77.50%
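
The percentages in the "aligned by ProSplign" column can be reproduced from the two count columns as aligned / retrieved. The sketch below does this for a few rows copied from the table above. `percent_aligned` is a hypothetical helper introduced for illustration.

```python
def percent_aligned(retrieved: int, aligned: int) -> float:
    """Share of retrieved sequences that were aligned, as a percentage."""
    return 100.0 * aligned / retrieved

# (source, retrieved from Entrez, aligned by ProSplign), copied from the table.
rows = [
    ("Pogona vitticeps high-quality model RefSeq (XP_)", 13_733, 13_141),
    ("Same-species GenBank", 33, 32),
    ("Xenopus GenBank", 31_816, 8_809),
    ("Homo sapiens known RefSeq (NP_)", 56_808, 39_685),
]

for source, retrieved, aligned in rows:
    print(f"{source}: {percent_aligned(retrieved, aligned):.2f}%")
```

In this table the counts passed to Gnomon are identical to the counts aligned, so the same calculation yields both percentage columns.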

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
