NCBI Jatropha curcas Annotation Release 101

The RefSeq genome records for Jatropha curcas were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Jatropha curcas Annotation Release 101.

Annotation release ID: 101
Date of Entrez queries for transcripts and proteins: Mar 31 2017
Date of submission of annotation to the public databases: Apr 6 2017
Software version: 7.3

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
JatCur_1.0	GCF_000696525.1	Chinese Academy of Sciences	06-02-2014	Reference	1 assembled chromosome; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	JatCur_1.0
Genes and pseudogenes	24,517
protein-coding	21,735
non-coding	1,884
pseudogenes	898
genes with variants	6,285
mRNAs	32,463
fully-supported	30,433
with > 5% ab initio	1,558
partial	457
with filled gap(s)	209
known RefSeq (NM_)	216
model RefSeq (XM_)	32,247
Other RNAs	3,756
fully-supported	3,309
with > 5% ab initio	0
partial	6
with filled gap(s)	5
known RefSeq (NR_)	0
model RefSeq (XR_)	3,321
CDSs	32,463
fully-supported	30,433
with > 5% ab initio	1,597
partial	375
with major correction(s)	297
known RefSeq (NP_)	216
model RefSeq (XP_)	32,247
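
The sub-counts in this table add up to their parent rows. A minimal Python sketch of that check follows; the values are copied from the table, while the dictionary keys and function names are illustrative and not part of the annotation pipeline.

```python
# Values copied from the Feature counts table for JatCur_1.0.
# Key names are illustrative only.
counts = {
    "genes_and_pseudogenes": 24_517,
    "protein_coding": 21_735,
    "non_coding": 1_884,
    "pseudogenes": 898,
    "mrnas": 32_463,
    "mrna_known_nm": 216,
    "mrna_model_xm": 32_247,
}


def gene_total(c):
    """Sum the three gene categories listed under 'Genes and pseudogenes'."""
    return c["protein_coding"] + c["non_coding"] + c["pseudogenes"]


def mrna_total(c):
    """Sum known (NM_) and model (XM_) mRNAs."""
    return c["mrna_known_nm"] + c["mrna_model_xm"]


print(gene_total(counts) == counts["genes_and_pseudogenes"])
print(mrna_total(counts) == counts["mrnas"])
```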

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	23,619	4,073	2,919	71	96,853
All transcripts	36,219	1,926	1,697	62	17,199
mRNA	32,463	1,979	1,734	166	17,199
misc_RNA	1,522	2,338	2,086	185	13,508
tRNA	435	74	73	71	88
lncRNA	1,799	1,068	801	62	8,141
Single-exon transcripts	3,405	1,356	1,164	166	5,969
coding transcripts (NM_/XM_ )	3,402	1,357	1,164	166	5,969
non-coding transcripts (NR_/XR_ )	3	878	784	362	1,487
CDSs	32,463	1,459	1,218	114	16,764
Exons	147,634	320	171	1	7,932
in coding transcripts (NM_/XM_ )	141,014	317	168	1	7,932
in non-coding transcripts (NR_/XR_ )	12,740	301	164	2	5,954
Introns	118,595	536	201	30	95,695
in coding transcripts (NM_/XM_ )	114,554	522	198	30	95,695
in non-coding transcripts (NR_/XR_ )	10,000	693	251	31	33,798

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.54	1	1	24
Number of exons per transcript	6.76	5	1	79

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Arabidopsis thaliana known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 21,735 coding genes, 20,301 had a protein with an alignment covering 50% or more of the query and 10,168 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
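
The two definitions can be illustrated with a short Python sketch. It assumes a single alignment with 1-based, inclusive coordinates on each sequence; the function name and the example lengths are hypothetical, and the pipeline's own coverage computation may treat multiple alignment segments differently.

```python
def coverage_percent(aligned_start, aligned_end, sequence_length):
    """Percent of a sequence included in one alignment.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


# Hypothetical example: a 400-residue annotated protein (query) aligned over
# residues 21..400 to residues 1..380 of a 500-residue target protein.
query_coverage = coverage_percent(21, 400, 400)   # 380 of 400 residues
target_coverage = coverage_percent(1, 380, 500)   # 380 of 500 residues
print(query_coverage, target_coverage)
```

In this example the query coverage is 95% and the target coverage is 76%, so the gene would count toward both the 50% and the 95% query-coverage thresholds.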

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Arabidopsis thaliana known RefSeq proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
JatCur_1.0	GCF_000696525.1	2.90%	27.54%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	216	216 (100.00%)	215 (99.54%)	99.34%	99.61%
Same-species Genbank	579	565 (97.58%)	541 (93.44%)	99.54%	98.73%
Same-species EST	46,859	42,617 (90.95%)	39,154 (83.56%)	99.35%	99.04%
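
The percentages in this table are each count divided by the number of sequences retrieved from Entrez. A minimal Python sketch of that arithmetic follows, with the counts copied from the table; the helper name `percent` is illustrative.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


# (retrieved from Entrez, aligned by Splign, passed to Gnomon)
rows = {
    "Same-species known RefSeq (NM_/NR_)": (216, 216, 215),
    "Same-species Genbank": (579, 565, 541),
    "Same-species EST": (46_859, 42_617, 39_154),
}

for source, (retrieved, aligned, passed) in rows.items():
    print(
        f"{source}: aligned {percent(aligned, retrieved):.2f}%, "
        f"passed to Gnomon {percent(passed, retrieved):.2f}%"
    )
```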

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent spliced reads	Number of introns
All	NA	Aggregate of all aligned samples	4,170,400,291	94%	18%	137,564
SAMEA2342024	NA	Jatropha curcas; Jatropha curcas_RNA-seq (Jatropha curcas, SAMEA2342024)	5,271,080	47%	2%	39,930
SAMN00003846	NA	developing seed (Jatropha curcas, SAMN00003846)	195,692	47%	20%	18,164
SAMN00188808	21492485	roots, mature leaves, flowers, developing seeds, embryos (Jatropha curcas, SAMN00188808)	383,937	66%	33%	71,038
SAMN01894327	24349370	General Sample from untreated and cold-treated seedlings of Jatropha curcas L. (Jatropha curcas, SAMN01894327)	55,112,142	84%	13%	98,195
SAMN02350404	NA	immature seeds (Jatropha curcas, SAMN02350404)	16,653,188	91%	22%	92,895
SAMN02350405	NA	intermediate seeds (Jatropha curcas, SAMN02350405)	43,328,830	91%	23%	110,426
SAMN02350406	NA	mature seeds (Jatropha curcas, SAMN02350406)	35,062,185	88%	20%	102,943
SAMN02356859	NA	Sample for Jatropha curcas 454 sequencing (Jatropha curcas, SAMN02356859)	1,714,433	87%	51%	101,295
SAMN02905749	NA	Leaf, resistant control (Jatropha curcas, missing, SAMN02905749)	90,560,792	91%	22%	113,409
SAMN02905750	NA	Leaf, resistant induced (Jatropha curcas, missing, SAMN02905750)	74,923,274	90%	21%	110,673
SAMN02905751	NA	Leaf, susceptible control (Jatropha curcas, missing, SAMN02905751)	102,338,758	92%	22%	115,838
SAMN02905752	NA	Leaf, susceptible induced (Jatropha curcas, missing, SAMN02905752)	76,993,984	89%	21%	112,165
SAMN03019548	26602946	leaf, 0d drought stress (Jatropha curcas, SAMN03019548)	88,070,540	93%	9%	110,806
SAMN03019549	26602946	leaf, 0d drought stress (Jatropha curcas, SAMN03019549)	87,797,814	93%	10%	112,289
SAMN03019550	26602946	leaf, 13d control (Jatropha curcas, SAMN03019550)	73,315,468	91%	9%	110,750
SAMN03019551	26602946	leaf, 13d control (Jatropha curcas, SAMN03019551)	70,667,001	92%	9%	110,069
SAMN03019552	26602946	leaf, 13d drought stress (Jatropha curcas, SAMN03019552)	93,745,945	93%	9%	113,831
SAMN03019553	26602946	leaf, 13d drought stress (Jatropha curcas, SAMN03019553)	78,073,310	93%	9%	111,772
SAMN03019554	26602946	root, 13d control (Jatropha curcas, SAMN03019554)	84,092,467	92%	9%	113,008
SAMN03019555	26602946	root, 13d control (Jatropha curcas, SAMN03019555)	72,662,825	89%	8%	107,489
SAMN03019556	26602946	root, 13d drought stress (Jatropha curcas, SAMN03019556)	100,132,641	90%	8%	115,717
SAMN03019557	26602946	root, 13d drought stress (Jatropha curcas, SAMN03019557)	100,662,504	92%	9%	115,769
SAMN03019558	26602946	leaf, 49d drought stress (Jatropha curcas, SAMN03019558)	72,056,684	92%	8%	108,835
SAMN03019559	26602946	leaf, 49d drought stress (Jatropha curcas, SAMN03019559)	78,727,880	92%	8%	109,719
SAMN03019560	26602946	root, 49d drought stress (Jatropha curcas, SAMN03019560)	74,931,868	90%	9%	109,700
SAMN03019561	26602946	root, 49d drought stress (Jatropha curcas, SAMN03019561)	79,298,403	91%	8%	110,151
SAMN03019562	26602946	leaf, 49-52d control (Jatropha curcas, SAMN03019562)	79,702,442	92%	8%	109,635
SAMN03019563	26602946	leaf, 49-52d control (Jatropha curcas, SAMN03019563)	70,721,106	93%	9%	108,739
SAMN03019564	26602946	root, 49-52d control (Jatropha curcas, SAMN03019564)	84,644,526	88%	9%	115,115
SAMN03019565	26602946	root, 49-52d control (Jatropha curcas, SAMN03019565)	70,694,588	89%	9%	113,008
SAMN03019566	26602946	leaf, 52d drought stress (Jatropha curcas, SAMN03019566)	130,362,442	93%	9%	114,229
SAMN03019567	26602946	leaf, 52d drought stress (Jatropha curcas, SAMN03019567)	58,747,250	93%	9%	106,516
SAMN03019568	26602946	root, 52d drought stress (Jatropha curcas, SAMN03019568)	67,247,442	91%	8%	111,934
SAMN03019569	26602946	root, 52d drought stress (Jatropha curcas, SAMN03019569)	90,398,094	92%	9%	113,974
SAMN03145040	NA	Leaf; Root; flower (Jatropha curcas, SAMN03145040)	167,085,070	93%	23%	128,499
SAMN03152301	25400171	Inflorescences (Jatropha curcas, SAMN03152301)	703,755	83%	43%	60,215
SAMN03160711	NA	seed (Jatropha curcas, SAMN03160711)	59,591,536	90%	16%	108,787
SAMN03486846	NA	leaf (Jatropha curcas, seedling, SAMN03486846)	54,987,425	182%	40%	114,645
SAMN03733282	NA	flower buds (Jatropha curcas, SAMN03733282)	54,996,594	93%	18%	112,857
SAMN05827448	NA	Shoot (Jatropha curcas, 2 years, SAMN05827448)	58,434,958	96%	28%	115,357
SAMN05827449	NA	Inflorescences (Jatropha curcas, 2 years, SAMN05827449)	46,322,163	96%	28%	118,009
SAMN05827450	NA	Shoot (Jatropha curcas, 2 years, SAMN05827450)	51,203,205	96%	28%	113,012
SAMN05827451	NA	Inflorescences (Jatropha curcas, 2 years, SAMN05827451)	56,062,944	96%	29%	119,717
SAMN05827452	NA	Shoot (Jatropha curcas, 2 years, SAMN05827452)	50,863,218	96%	27%	114,291
SAMN05827453	NA	Inflorescences (Jatropha curcas, 2 years, SAMN05827453)	66,298,912	96%	28%	120,145
SAMN05827454	NA	Shoot (Jatropha curcas, 2 years, SAMN05827454)	42,699,070	96%	28%	113,221
SAMN05827455	NA	Shoot (Jatropha curcas, 2 years, SAMN05827455)	47,423,637	96%	28%	112,619
SAMN05827456	NA	Shoot (Jatropha curcas, 2 years, SAMN05827456)	50,405,875	96%	27%	112,363
SAMN05827457	NA	Inflorescences (Jatropha curcas, 2 years, SAMN05827457)	58,202,433	96%	28%	113,676
SAMN05827458	NA	Shoot (Jatropha curcas, 2 years, SAMN05827458)	46,218,305	96%	26%	113,049
SAMN05827459	NA	Shoot (Jatropha curcas, 2 years, SAMN05827459)	57,567,401	95%	27%	112,034
SAMN05827460	NA	Shoot (Jatropha curcas, 2 years, SAMN05827460)	48,485,694	96%	27%	111,426
SAMN05827461	NA	Inflorescences (Jatropha curcas, 2 years, SAMN05827461)	51,981,297	96%	27%	115,699
SAMN05827462	NA	Shoot (Jatropha curcas, 2 years, SAMN05827462)	45,134,569	96%	28%	114,148
SAMN05827463	NA	Shoot (Jatropha curcas, 2 years, SAMN05827463)	42,352,532	96%	28%	112,214
SAMN05827464	NA	Shoot (Jatropha curcas, 2 years, SAMN05827464)	54,573,642	96%	29%	115,276
SAMN05827465	NA	Inflorescences (Jatropha curcas, 2 years, SAMN05827465)	44,640,267	96%	28%	118,289
SAMN05949192	NA	flower inflorescence buds, monoecious, 3-4d (Jatropha curcas, Two years, SAMN05949192)	49,065,896	94%	29%	118,130
SAMN05949193	NA	flower inflorescence buds, monoecious, 3-4d (Jatropha curcas, Two years, SAMN05949193)	49,392,394	94%	28%	116,543
SAMN05949194	NA	flower inflorescence buds, monoecious, 3-4d (Jatropha curcas, Two years, SAMN05949194)	49,668,810	94%	29%	117,954
SAMN05949195	NA	flower inflorescence buds, monoecious, 8-9d (Jatropha curcas, Two years, SAMN05949195)	39,256,618	94%	28%	116,357
SAMN05949196	NA	flower inflorescence buds, monoecious, 8-9d (Jatropha curcas, Two years, SAMN05949196)	54,242,280	94%	28%	119,488
SAMN05949197	NA	flower inflorescence buds, monoecious, 8-9d (Jatropha curcas, Two years, SAMN05949197)	42,788,554	93%	27%	115,615
SAMN05949198	NA	flower inflorescence buds, gynoecious, 3-4d (Jatropha curcas, Two years, SAMN05949198)	51,850,822	92%	28%	118,015
SAMN05949199	NA	flower inflorescence buds, gynoecious, 3-4d (Jatropha curcas, Two years, SAMN05949199)	48,715,656	93%	29%	117,239
SAMN05949200	NA	flower inflorescence buds, gynoecious, 3-4d (Jatropha curcas, Two years, SAMN05949200)	62,050,746	93%	29%	119,709
SAMN05949201	NA	flower inflorescence buds, gynoecious, 8-9d (Jatropha curcas, Two years, SAMN05949201)	53,003,790	91%	25%	114,352
SAMN05949202	NA	flower inflorescence buds, gynoecious, 8-9d (Jatropha curcas, Two years, SAMN05949202)	53,585,986	92%	27%	116,818
SAMN05949203	NA	flower inflorescence buds, gynoecious, 8-9d (Jatropha curcas, Two years, SAMN05949203)	51,252,702	93%	28%	120,565

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent spliced reads
ERR420522	ERX386815	ERP004701	SAMEA2342024	5,271,080	47%	2%
SRR027577	SRX011411	SRP001241	SAMN00003846	195,692	47%	20%
SRR087417	SRX035761	SRP004898	SAMN00188808	383,937	66%	33%
SRR653198	SRX220056	SRP018233	SAMN01894327	55,112,142	84%	13%
SRR972445	SRX346819	SRP029638	SAMN02350404	16,653,188	91%	22%
SRR972446	SRX346820	SRP029638	SAMN02350405	43,328,830	91%	23%
SRR972447	SRX346821	SRP029638	SAMN02350406	35,062,185	88%	20%
SRR998547	SRX352166	SRP029977	SAMN02356859	1,621,838	87%	51%
SRR998927	SRX352166	SRP029977	SAMN02356859	92,595	91%	43%
SRR1539206	SRX672142	SRP044808	SAMN02905749	90,560,792	91%	22%
SRR1560724	SRX689084	SRP044808	SAMN02905750	74,923,274	90%	21%
SRR1560722	SRX689082	SRP044808	SAMN02905751	102,338,758	92%	22%
SRR1560723	SRX689083	SRP044808	SAMN02905752	76,993,984	89%	21%
SRR1565783	SRX692835	SRP046221	SAMN03019548	88,070,540	93%	9%
SRR1565784	SRX692836	SRP046221	SAMN03019549	87,797,814	93%	10%
SRR1565785	SRX692837	SRP046221	SAMN03019550	73,315,468	91%	9%
SRR1565786	SRX692838	SRP046221	SAMN03019551	70,667,001	92%	9%
SRR1565787	SRX692839	SRP046221	SAMN03019552	93,745,945	93%	9%
SRR1565788	SRX692840	SRP046221	SAMN03019553	78,073,310	93%	9%
SRR1565789	SRX692841	SRP046221	SAMN03019554	84,092,467	92%	9%
SRR1565790	SRX692842	SRP046221	SAMN03019555	72,662,825	89%	8%
SRR1565791	SRX692843	SRP046221	SAMN03019556	100,132,641	90%	8%
SRR1565792	SRX692844	SRP046221	SAMN03019557	100,662,504	92%	9%
SRR1565793	SRX692845	SRP046221	SAMN03019558	72,056,684	92%	8%
SRR1565794	SRX692846	SRP046221	SAMN03019559	78,727,880	92%	8%
SRR1565795	SRX692847	SRP046221	SAMN03019560	74,931,868	90%	9%
SRR1565796	SRX692848	SRP046221	SAMN03019561	79,298,403	91%	8%
SRR1565797	SRX692849	SRP046221	SAMN03019562	79,702,442	92%	8%
SRR1565798	SRX692850	SRP046221	SAMN03019563	70,721,106	93%	9%
SRR1565799	SRX692851	SRP046221	SAMN03019564	84,644,526	88%	9%
SRR1565800	SRX692852	SRP046221	SAMN03019565	70,694,588	89%	9%
SRR1565801	SRX692853	SRP046221	SAMN03019566	130,362,442	93%	9%
SRR1565802	SRX692854	SRP046221	SAMN03019567	58,747,250	93%	9%
SRR1565803	SRX692855	SRP046221	SAMN03019568	67,247,442	91%	8%
SRR1565804	SRX692856	SRP046221	SAMN03019569	90,398,094	92%	9%
SRR1635040	SRX744542	SRP049319	SAMN03145040	85,386,014	93%	23%
SRR1635045	SRX747345	SRP049319	SAMN03145040	81,699,056	94%	24%
SRR1639661	SRX750581	SRP049486	SAMN03160711	59,591,536	90%	16%
SRR1663442	SRX763342	SRP050030	SAMN03152301	182,722	87%	50%
SRR1663443	SRX768574	SRP050030	SAMN03152301	172,682	92%	46%
SRR1663444	SRX768575	SRP050030	SAMN03152301	181,699	68%	28%
SRR1663445	SRX768576	SRP050030	SAMN03152301	166,652	86%	47%
SRR2102905	SRX1097498	SRP057220	SAMN03486846	28,493,022	180%	38%
SRR2101829	SRX997124	SRP057220	SAMN03486846	26,494,403	183%	42%
SRR2039597	SRX1037655	SRP058680	SAMN03733282	54,996,594	93%	18%
SRR4308595	SRX2200465	SRP090662	SAMN05827448	58,434,958	96%	28%
SRR4426289	SRX2248218	SRP090662	SAMN05827449	46,322,163	96%	28%
SRR4426290	SRX2248219	SRP090662	SAMN05827450	51,203,205	96%	28%
SRR4426295	SRX2248224	SRP090662	SAMN05827451	56,062,944	96%	29%
SRR4426296	SRX2248225	SRP090662	SAMN05827452	50,863,218	96%	27%
SRR4426312	SRX2248241	SRP090662	SAMN05827453	66,298,912	96%	28%
SRR4426313	SRX2248242	SRP090662	SAMN05827454	42,699,070	96%	28%
SRR4426314	SRX2248243	SRP090662	SAMN05827455	47,423,637	96%	28%
SRR4449122	SRX2248244	SRP090662	SAMN05827456	50,405,875	96%	27%
SRR4446116	SRX2248245	SRP090662	SAMN05827457	58,202,433	96%	28%
SRR4449125	SRX2267588	SRP090662	SAMN05827458	46,218,305	96%	26%
SRR4449140	SRX2267590	SRP090662	SAMN05827459	57,567,401	95%	27%
SRR4449144	SRX2267604	SRP090662	SAMN05827460	48,485,694	96%	27%
SRR4449147	SRX2267606	SRP090662	SAMN05827461	51,981,297	96%	27%
SRR4449148	SRX2267607	SRP090662	SAMN05827462	45,134,569	96%	28%
SRR4449151	SRX2267609	SRP090662	SAMN05827463	42,352,532	96%	28%
SRR4449154	SRX2267612	SRP090662	SAMN05827464	54,573,642	96%	29%
SRR4449157	SRX2267614	SRP090662	SAMN05827465	44,640,267	96%	28%
SRR4473571	SRX2279490	SRP092157	SAMN05949192	49,065,896	94%	29%
SRR4473572	SRX2279491	SRP092157	SAMN05949193	49,392,394	94%	28%
SRR4473565	SRX2279484	SRP092157	SAMN05949194	49,668,810	94%	29%
SRR4473566	SRX2279485	SRP092157	SAMN05949195	39,256,618	94%	28%
SRR4473567	SRX2279486	SRP092157	SAMN05949196	54,242,280	94%	28%
SRR4473568	SRX2279487	SRP092157	SAMN05949197	42,788,554	93%	27%
SRR4473569	SRX2279488	SRP092157	SAMN05949198	51,850,822	92%	28%
SRR4473570	SRX2279489	SRP092157	SAMN05949199	48,715,656	93%	29%
SRR4473575	SRX2279494	SRP092157	SAMN05949200	62,050,746	93%	29%
SRR4473576	SRX2279495	SRP092157	SAMN05949201	53,003,790	91%	25%
SRR4473573	SRX2279492	SRP092157	SAMN05949202	53,585,986	92%	27%
SRR4473574	SRX2279493	SRP092157	SAMN05949203	51,252,702	93%	28%
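
Where a sample has several runs, the read count in the by-sample table is the sum of its runs. A minimal Python sketch of that aggregation follows, using four rows copied from the by-run table; the table is reduced to three columns for illustration, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Four rows from the by-run table, reduced to three tab-separated columns.
RUN_TABLE = (
    "Run\tSample\tNumber of reads\n"
    "SRR998547\tSAMN02356859\t1,621,838\n"
    "SRR998927\tSAMN02356859\t92,595\n"
    "SRR1635040\tSAMN03145040\t85,386,014\n"
    "SRR1635045\tSAMN03145040\t81,699,056\n"
)


def reads_per_sample(tsv_text):
    """Sum the 'Number of reads' column for each value of 'Sample'."""
    totals = defaultdict(int)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Strip thousands separators before converting to int.
        totals[row["Sample"]] += int(row["Number of reads"].replace(",", ""))
    return dict(totals)


totals = reads_per_sample(RUN_TABLE)
for sample, reads in totals.items():
    print(f"{sample}\t{reads:,}")
```

The two totals agree with the by-sample rows for SAMN02356859 (1,714,433 reads) and SAMN03145040 (167,085,070 reads).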

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	422	419 (99.29%)	419 (99.29%)	80.60%	88.48%
Same-species known RefSeq (NP_)	216	215 (99.54%)	215 (99.54%)	81.86%	88.36%
Cucumis melo high-quality model RefSeq (XP_)	11,901	11,704 (98.34%)	11,704 (98.34%)	69.05%	77.79%
Cucumis melo known RefSeq (NP_)	115	109 (94.78%)	109 (94.78%)	69.64%	83.77%
Arabidopsis thaliana known RefSeq (NP_)	48,148	42,672 (88.63%)	42,672 (88.63%)	66.59%	71.12%
Glycine max high-quality model RefSeq (XP_)	22,495	21,880 (97.27%)	21,880 (97.27%)	68.40%	77.06%
Glycine max known RefSeq (NP_)	6,862	6,583 (95.93%)	6,583 (95.93%)	70.23%	77.87%
Eucalyptus grandis high-quality model RefSeq (XP_)	16,668	16,021 (96.12%)	16,021 (96.12%)	67.49%	76.80%
Eucalyptus grandis known RefSeq (NP_)	37	36 (97.30%)	36 (97.30%)	75.44%	82.26%
Populus euphratica high-quality model RefSeq (XP_)	18,422	18,043 (97.94%)	18,043 (97.94%)	70.91%	81.42%
Populus euphratica known RefSeq (NP_)	39	39 (100.00%)	39 (100.00%)	67.48%	80.94%

Comparison of the current and previous annotations

The annotation produced for this release (101) was compared to the annotation in the previous release (100) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.
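
To make the two ingredients of the scores concrete, the Python sketch below computes one possible exon-overlap measure and one possible boundary-match measure for a pair of features. It is an illustration only: the report does not specify the pipeline's scoring formulas, so the Jaccard-style ratios, the function names and the example coordinates are all assumptions.

```python
def exon_bases(exons):
    """Positions covered by a list of (start, end) exons, 1-based inclusive."""
    return {pos for start, end in exons for pos in range(start, end + 1)}


def exon_overlap_score(current, previous):
    """Shared exonic bases divided by bases in either feature (assumed measure)."""
    cur, prev = exon_bases(current), exon_bases(previous)
    union = cur | prev
    return len(cur & prev) / len(union) if union else 0.0


def exon_boundaries(exons):
    """Exon starts and ends, tagged so a start never matches an end."""
    starts = {("start", start) for start, _ in exons}
    ends = {("end", end) for _, end in exons}
    return starts | ends


def boundary_match_score(current, previous):
    """Shared exon boundaries divided by boundaries in either feature (assumed measure)."""
    cur, prev = exon_boundaries(current), exon_boundaries(previous)
    union = cur | prev
    return len(cur & prev) / len(union) if union else 0.0


# Hypothetical transcripts: the last exon ends at 400 in the current
# annotation and at 450 in the previous one.
current = [(100, 200), (300, 400)]
previous = [(100, 200), (300, 450)]
print(exon_overlap_score(current, previous))
print(boundary_match_score(current, previous))
```

Under these assumed measures, identical features score 1.0 on both, and a pair like the one above, with high exon overlap but a changed boundary, is the kind of case the categories in the table separate from identical features.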

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	JatCur_1.0 (Current) to JatCur_1.0 (Previous)
Identical	5%
Minor changes	74%
Major changes	11%
New	8%
Deprecated	5%
Other	2%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
