NCBI Manihot esculenta Annotation Release 101

The RefSeq genome records for Manihot esculenta were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
BUSCO results: Annotation completeness assessed with BUSCO
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Similarity of current and previous assembly: The similarity of the current and previous assembly
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Manihot esculenta Annotation Release 101.

Annotation release ID: 101
Date of Entrez queries for transcripts and proteins: Sep 20 2021
Date of submission of annotation to the public databases: Sep 28 2021
Software version: 9.0

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
M.esculenta_v8	GCF_001659605.2	DOE-Joint Genome Institute	08-13-2021	Reference	20 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	M.esculenta_v8
Genes and pseudogenes	34,312
protein-coding	29,735
non-coding	3,366
Transcribed pseudogenes	9
Non-transcribed pseudogenes	1,199
genes with variants	9,970
Immunoglobulin/T-cell receptor gene segments	0
other	3
mRNAs	49,175
fully-supported	45,851
with > 5% ab initio	2,768
partial	113
with filled gap(s)	0
known RefSeq (NM_)	0
model RefSeq (XM_)	49,175
non-coding RNAs	7,991
fully-supported	6,044
with > 5% ab initio	0
partial	1
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	7,108
pseudo transcripts	9
fully-supported	9
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	9
CDSs	49,290
fully-supported	45,851
with > 5% ab initio	2,837
partial	113
with major correction(s)	233
known RefSeq (NP_)	0
model RefSeq (XP_)	49,290

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	33,104	4,500	2,912	37	480,524
All transcripts	57,166	2,051	1,771	37	20,283
mRNA	49,175	2,091	1,793	141	20,283
misc_RNA	3,826	2,515	2,158	162	14,177
tRNA	873	74	73	37	90
lncRNA	2,221	1,805	1,207	116	10,159
snoRNA	358	104	103	63	218
snRNA	100	151	160	101	200
rRNA	613	1,079	156	103	3,403
Single-exon transcripts	4,541	1,395	1,200	198	6,262
coding transcripts (NM_/XM_)	4,541	1,395	1,200	198	6,262
CDSs	49,290	1,503	1,233	90	20,283
Exons	206,164	341	175	1	20,268
in coding transcripts (NM_/XM_)	195,464	334	172	1	20,268
in non-coding transcripts (NR_/XR_)	24,175	331	150	1	9,627
Introns	165,089	685	197	30	108,848
in coding transcripts (NM_/XM_)	158,297	661	194	30	108,848
in non-coding transcripts (NR_/XR_)	19,866	876	230	30	96,380

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.75	1	1	50
Number of exons per transcript	7.02	5	1	79

BUSCO analysis of gene annotation

BUSCO v4.1.4 (Simão et al. 2015, PMID: 26059717) was run in "protein" mode on the annotated gene set, using the longest protein of each gene and the eudicots_odb10 lineage dataset. Results are reported for the gene set from the primary assembly unit, and presented in BUSCO notation (C:complete [S:single-copy, D:duplicated], F:fragmented, M:missing, n:number of genes used).
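The BUSCO notation described above can be read programmatically. The following is a minimal sketch, assuming the summary is written in the usual one-line BUSCO layout with percentages; the function name `parse_busco_summary` and the example string are hypothetical, and the example values are illustrative only, not the results of this annotation release.

```python
import re

# Pattern for a one-line BUSCO summary such as
# "C:..%[S:..%,D:..%],F:..%,M:..%,n:...." (layout assumed, see lead-in).
BUSCO_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line: str) -> dict:
    """Return the percentages (floats) and gene count (int) from a summary line."""
    match = BUSCO_RE.search(line.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {line!r}")
    fields = {key: float(value) for key, value in match.groupdict().items()}
    fields["n"] = int(fields["n"])
    return fields


# Hypothetical values for illustration only; NOT the results of this release.
example = "C:90.0%[S:80.0%,D:10.0%],F:4.0%,M:6.0%,n:1000"
summary = parse_busco_summary(example)
print(summary)
```

In this notation the complete fraction is the sum of the single-copy and duplicated fractions, so `summary["S"] + summary["D"]` should equal `summary["C"]` up to rounding.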

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Arabidopsis thaliana known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 29,620 coding genes, 27,343 genes had a protein with an alignment covering 50% or more of the query and 13,601 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
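As a worked illustration of these two definitions, the sketch below computes query and target coverage for a single alignment. It assumes 1-based, inclusive alignment coordinates; the function name `coverage_percent` and the example coordinates are hypothetical and are not taken from this report.

```python
def coverage_percent(aligned_start: int, aligned_end: int, sequence_length: int) -> float:
    """Percent of a sequence included in one alignment (1-based, inclusive)."""
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


# Hypothetical alignment: residues 11-200 of a 200-aa annotated protein (query)
# aligned to residues 1-190 of a 380-aa target protein.
query_cov = coverage_percent(11, 200, 200)
target_cov = coverage_percent(1, 190, 380)

print(f"query coverage:  {query_cov:.1f}%")
print(f"target coverage: {target_cov:.1f}%")
```

With these made-up coordinates the annotated protein meets a 95% query-coverage threshold while only half of the target is covered, which is why the two measures are reported separately.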

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Arabidopsis thaliana known RefSeq proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker (if calculated), for each assembly. RepeatMasker results are only calculated for organisms with complete Dfam HMM model collections.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with WindowMasker
M.esculenta_v8	GCF_001659605.2	55.53%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign, minimap2, or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	1,046	1,034 (98.85%)	948 (90.63%)	99.66%	99.52%
Same-species EST	122,660	113,981 (92.92%)	108,492 (88.45%)	99.17%	98.69%
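The percentages in the table above can be recomputed from the counts. The sketch below assumes that both percentages are taken relative to the number of sequences retrieved from Entrez, which is consistent with the values shown; the variable and function names are illustrative.

```python
# (source, retrieved from Entrez, aligned by Splign, passed to Gnomon)
transcript_rows = [
    ("Same-species GenBank", 1_046, 1_034, 948),
    ("Same-species EST", 122_660, 113_981, 108_492),
]


def percent_of(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


recomputed = {
    source: (percent_of(aligned, retrieved), percent_of(passed, retrieved))
    for source, retrieved, aligned, passed in transcript_rows
}

for source, (pct_aligned, pct_passed) in recomputed.items():
    print(f"{source}: {pct_aligned}% aligned, {pct_passed}% passed to Gnomon")
```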

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	6,955,847,003	66%	31%	204,192
SAMEA7532454	NA	Fibrous roots Time Point 1 (Manihot esculenta subsp. esculenta, SAMEA7532454)	229,651,164	45%	27%	169,797
SAMEA7532455	NA	Fibrous roots Time Point 3 (Manihot esculenta subsp. esculenta, SAMEA7532455)	191,343,342	40%	26%	166,231
SAMEA7532456	NA	Fibrous roots Time Point 7 (Manihot esculenta subsp. esculenta, SAMEA7532456)	175,530,656	39%	24%	160,809
SAMEA7532457	NA	Storage roots Time Point 1 (Manihot esculenta subsp. esculenta, SAMEA7532457)	180,772,008	42%	27%	164,963
SAMEA7532458	NA	Storage roots Time Point 3 (Manihot esculenta subsp. esculenta, SAMEA7532458)	158,497,306	43%	26%	162,290
SAMEA7532459	NA	Storage roots Time Point 7 (Manihot esculenta subsp. esculenta, SAMEA7532459)	295,128,706	26%	25%	158,517
SAMEA7532460	NA	Sink leaf Time Point 1 (Manihot esculenta subsp. esculenta, SAMEA7532460)	221,053,276	33%	27%	165,489
SAMEA7532461	NA	Sink leaf Time Point 3 (Manihot esculenta subsp. esculenta, SAMEA7532461)	289,897,478	31%	27%	162,284
SAMEA7532462	NA	Sink leaf Time Point 7 (Manihot esculenta subsp. esculenta, SAMEA7532462)	202,564,792	37%	24%	162,732
SAMEA7532463	NA	Source leaf Time Point 1 (Manihot esculenta subsp. esculenta, SAMEA7532463)	205,138,574	25%	26%	154,232
SAMEA7532464	NA	Source leaf Time Point 3 (Manihot esculenta subsp. esculenta, SAMEA7532464)	235,668,064	24%	26%	148,138
SAMEA7532465	NA	Source leaf Time Point 7 (Manihot esculenta subsp. esculenta, SAMEA7532465)	258,964,084	25%	25%	155,805
SAMN02263987	NA	leaf, stem (Manihot esculenta, SAMN02263987)	205,405	77%	56%	53,805
SAMN02263988	NA	leaf, stem (Manihot esculenta, SAMN02263988)	346,000	82%	58%	71,448
SAMN02263989	NA	leaf, stem (Manihot esculenta, SAMN02263989)	208,365	79%	56%	54,164
SAMN02263990	NA	leaf, stem (Manihot esculenta, SAMN02263990)	291,438	78%	54%	60,150
SAMN02263991	NA	leaf, stem (Manihot esculenta, SAMN02263991)	295,774	79%	58%	52,509
SAMN02263992	NA	leaf, stem (Manihot esculenta, SAMN02263992)	261,139	79%	57%	56,329
SAMN02263993	NA	leaf, stem (Manihot esculenta, SAMN02263993)	190,325	81%	57%	44,913
SAMN02263994	NA	leaf, stem (Manihot esculenta, SAMN02263994)	299,198	80%	59%	59,830
SAMN02263996	NA	leaf, stem (Manihot esculenta, SAMN02263996)	50,550	74%	53%	17,379
SAMN02263999	NA	leaf, stem (Manihot esculenta, SAMN02263999)	232,379	31%	38%	18,009
SAMN02444908	25120000	stem (Manihot esculenta, SAMN02444908)	34,253,318	52%	20%	133,931
SAMN02444909	25120000	stem (Manihot esculenta, SAMN02444909)	17,083,990	76%	19%	128,217
SAMN02444910	25120000	stem (Manihot esculenta, SAMN02444910)	17,570,286	64%	18%	122,631
SAMN02444911	25120000	stem (Manihot esculenta, SAMN02444911)	7,519,304	26%	19%	69,956
SAMN02444912	25120000	stem (Manihot esculenta, SAMN02444912)	20,506,512	56%	18%	125,585
SAMN02444913	25120000	stem (Manihot esculenta, SAMN02444913)	15,164,550	83%	19%	127,799
SAMN02444914	25120000	stem (Manihot esculenta, SAMN02444914)	32,761,886	48%	20%	132,014
SAMN02444915	25120000	stem (Manihot esculenta, SAMN02444915)	16,091,954	61%	18%	120,389
SAMN02444916	25120000	stem (Manihot esculenta, SAMN02444916)	13,776,232	83%	19%	126,355
SAMN02444917	25120000	stem (Manihot esculenta, SAMN02444917)	14,924,866	75%	19%	125,628
SAMN02444918	25120000	stem (Manihot esculenta, SAMN02444918)	7,804,156	22%	19%	66,591
SAMN02444919	25120000	stem (Manihot esculenta, SAMN02444919)	19,272,674	53%	18%	123,268
SAMN03225035	NA	Leaf (Manihot esculenta, 1 month, SAMN03225035)	1,121,424	41%	78%	8,857
SAMN03225177	NA	Leaf (Manihot esculenta subsp. peruviana, 1 month, SAMN03225177)	1,512,738	30%	46%	18,981
SAMN05208170	28116755	oes (Manihot esculenta, SAMN05208170)	41,867,734	91%	23%	153,378
SAMN05208171	28116755	ram (Manihot esculenta, SAMN05208171)	22,189,088	89%	28%	135,208
SAMN05208172	28116755	leaf (Manihot esculenta, SAMN05208172)	26,468,968	82%	23%	136,619
SAMN05208173	28116755	midvein (Manihot esculenta, SAMN05208173)	18,847,368	80%	22%	135,887
SAMN05208174	28116755	petiole (Manihot esculenta, SAMN05208174)	25,053,278	81%	20%	142,215
SAMN05208175	28116755	lateral_bud (Manihot esculenta, SAMN05208175)	33,101,088	87%	24%	151,740
SAMN05208176	28116755	stem (Manihot esculenta, SAMN05208176)	32,669,760	87%	20%	147,979
SAMN05208177	28116755	storage_root (Manihot esculenta, SAMN05208177)	28,015,652	84%	22%	130,259
SAMN05208178	28116755	fibrous_root (Manihot esculenta, SAMN05208178)	30,303,298	86%	24%	147,358
SAMN05208179	28116755	fec (Manihot esculenta, SAMN05208179)	36,133,598	86%	25%	152,584
SAMN05208180	28116755	oes (Manihot esculenta, SAMN05208180)	32,136,026	88%	24%	148,471
SAMN05208181	28116755	sam (Manihot esculenta, SAMN05208181)	17,628,586	83%	22%	134,980
SAMN05208182	28116755	ram (Manihot esculenta, SAMN05208182)	24,790,804	87%	29%	136,868
SAMN05208183	28116755	leaf (Manihot esculenta, SAMN05208183)	38,616,756	87%	23%	145,213
SAMN05208184	28116755	midvein (Manihot esculenta, SAMN05208184)	30,841,568	87%	23%	148,850
SAMN05208185	28116755	petiole (Manihot esculenta, SAMN05208185)	11,246,666	84%	21%	128,651
SAMN05208186	28116755	lateral_bud (Manihot esculenta, SAMN05208186)	28,711,148	90%	23%	150,519
SAMN05208187	28116755	stem (Manihot esculenta, SAMN05208187)	36,284,106	89%	20%	148,668
SAMN05208188	28116755	storage_root (Manihot esculenta, SAMN05208188)	32,521,626	90%	22%	127,031
SAMN05208189	28116755	fibrous_root (Manihot esculenta, SAMN05208189)	27,396,130	90%	25%	147,875
SAMN05208190	28116755	fec (Manihot esculenta, SAMN05208190)	37,716,286	90%	26%	154,235
SAMN05208191	28116755	oes (Manihot esculenta, SAMN05208191)	33,294,020	91%	26%	149,004
SAMN05208192	28116755	sam (Manihot esculenta, SAMN05208192)	17,736,008	87%	22%	134,087
SAMN05208193	28116755	ram (Manihot esculenta, SAMN05208193)	29,048,736	88%	26%	141,263
SAMN05208194	28116755	leaf (Manihot esculenta, SAMN05208194)	21,361,852	87%	22%	132,145
SAMN05208195	28116755	midvein (Manihot esculenta, SAMN05208195)	25,650,536	88%	22%	145,513
SAMN05208196	28116755	sam (Manihot esculenta, SAMN05208196)	17,131,430	70%	22%	127,588
SAMN05208197	28116755	petiole (Manihot esculenta, SAMN05208197)	30,707,318	90%	20%	149,331
SAMN05208198	28116755	lateral_bud (Manihot esculenta, SAMN05208198)	33,673,580	89%	23%	156,220
SAMN05208199	28116755	stem (Manihot esculenta, SAMN05208199)	25,615,976	86%	19%	144,137
SAMN05208200	28116755	fibrous_root (Manihot esculenta, SAMN05208200)	24,583,548	85%	24%	147,631
SAMN05208201	28116755	fec (Manihot esculenta, SAMN05208201)	58,625,198	91%	26%	162,186
SAMN05933012	NA	root (Manihot esculenta, SAMN05933012)	25,771,488	71%	26%	120,203
SAMN05933013	NA	root (Manihot esculenta, SAMN05933013)	34,186,514	61%	24%	79,408
SAMN05933014	NA	root (Manihot esculenta, SAMN05933014)	27,633,062	64%	25%	110,164
SAMN05933015	NA	root (Manihot esculenta, SAMN05933015)	29,174,012	69%	25%	103,131
SAMN05933016	NA	root (Manihot esculenta, SAMN05933016)	29,298,282	70%	24%	103,305
SAMN05933017	NA	root (Manihot esculenta, SAMN05933017)	31,697,218	73%	25%	114,570
SAMN07261492	29193665	storage root (Manihot esculenta, SAMN07261492)	36,359,738	93%	27%	123,982
SAMN07261493	29193665	storage root (Manihot esculenta, SAMN07261493)	35,727,884	91%	26%	124,645
SAMN07261550	29193665	storage root (Manihot esculenta, SAMN07261550)	40,560,692	92%	26%	136,908
SAMN07261551	29193665	storage root (Manihot esculenta, SAMN07261551)	27,020,166	93%	26%	130,809
SAMN07261552	29193665	storage root (Manihot esculenta, SAMN07261552)	30,946,486	93%	26%	134,790
SAMN07261553	29193665	storage root (Manihot esculenta, SAMN07261553)	34,366,342	94%	26%	135,203
SAMN07261554	29193665	storage root (Manihot esculenta, SAMN07261554)	33,697,024	93%	26%	132,123
SAMN07261555	29193665	storage root (Manihot esculenta, SAMN07261555)	37,142,480	93%	26%	135,716
SAMN07261556	29193665	storage root (Manihot esculenta, SAMN07261556)	36,127,236	94%	27%	123,579
SAMN08564055	NA	leaf (Manihot esculenta, unexpanded leaf, SAMN08564055)	156,166,088	91%	36%	174,389
SAMN08564056	NA	leaf (Manihot esculenta, fully-expanded leaf, SAMN08564056)	147,826,362	90%	36%	171,594
SAMN08564057	NA	leaf (Manihot esculenta, unexpanded leaf, SAMN08564057)	167,077,730	90%	36%	175,937
SAMN08564058	NA	leaf (Manihot esculenta, fully-expanded leaf, SAMN08564058)	163,468,398	91%	36%	171,768
SAMN09470104	NA	root (Manihot esculenta, SAMN09470104)	39,737,282	88%	39%	149,053
SAMN09470105	NA	root (Manihot esculenta, SAMN09470105)	36,332,152	86%	38%	147,878
SAMN09470106	NA	root (Manihot esculenta, SAMN09470106)	40,358,974	88%	39%	150,915
SAMN09470107	NA	root (Manihot esculenta, SAMN09470107)	33,436,654	88%	38%	147,973
SAMN09470108	NA	root (Manihot esculenta, SAMN09470108)	41,713,642	87%	39%	149,001
SAMN09470109	NA	root (Manihot esculenta, SAMN09470109)	36,280,500	88%	39%	148,089
SAMN09470110	NA	root (Manihot esculenta, SAMN09470110)	47,967,278	87%	38%	150,796
SAMN09470111	NA	root (Manihot esculenta, SAMN09470111)	44,742,854	88%	37%	151,119
SAMN09470112	NA	root (Manihot esculenta, SAMN09470112)	47,993,632	88%	39%	151,776
SAMN10928296	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928296)	53,783,458	89%	37%	165,074
SAMN10928297	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928297)	53,076,862	89%	38%	163,969
SAMN10928298	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928298)	59,044,076	85%	37%	167,817
SAMN10928299	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928299)	54,357,350	85%	36%	165,282
SAMN10928300	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928300)	54,954,720	89%	37%	165,081
SAMN10928301	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928301)	57,013,328	87%	37%	164,491
SAMN10928302	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928302)	45,746,592	85%	36%	162,686
SAMN10928303	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928303)	57,807,330	85%	37%	165,935
SAMN10928304	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928304)	57,824,440	89%	38%	165,689
SAMN10928305	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928305)	59,100,342	87%	37%	166,688
SAMN10928306	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928306)	61,176,816	85%	38%	168,011
SAMN10928307	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928307)	54,520,108	85%	37%	166,719
SAMN10928308	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928308)	51,305,474	88%	38%	163,805
SAMN10928309	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928309)	56,195,990	88%	37%	165,596
SAMN10928310	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928310)	57,805,040	85%	36%	167,160
SAMN10928311	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928311)	49,557,808	85%	37%	165,556
SAMN10928312	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928312)	51,902,006	87%	37%	164,394
SAMN10928313	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928313)	43,959,656	88%	37%	161,886
SAMN10928314	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928314)	52,319,068	86%	35%	165,964
SAMN10928315	NA	Cassava shoot apices and youngest expanded leaf (Manihot esculenta, SAMN10928315)	48,547,092	86%	36%	165,371
SAMN13324141	NA	storage roots of wild-type cassava (Manihot esculenta, SAMN13324141)	44,747,686	89%	32%	139,407
SAMN13324142	NA	storage roots of wild-type cassava (Manihot esculenta, SAMN13324142)	43,460,418	89%	31%	139,206
SAMN13324143	NA	storage roots of MeSSVI-RNAi transgenic cassava (Manihot esculenta, SAMN13324143)	41,941,878	89%	33%	144,934
SAMN13324144	NA	storage roots of MeSSVI-RNAi transgenic cassava (Manihot esculenta, SAMN13324144)	44,112,218	90%	33%	139,680
SAMN15090723	NA	tuber (Manihot esculenta, 240, SAMN15090723)	43,271,144	93%	38%	127,746
SAMN15090724	NA	tuber (Manihot esculenta, 240, SAMN15090724)	52,103,066	93%	38%	127,824
SAMN15090725	NA	tuber (Manihot esculenta, 240, SAMN15090725)	51,326,090	93%	38%	129,990
SAMN15090726	NA	tuber (Manihot esculenta, 240, SAMN15090726)	52,253,824	94%	38%	131,854
SAMN15090727	NA	tuber (Manihot esculenta, 240, SAMN15090727)	41,798,826	93%	38%	125,151
SAMN15090728	NA	tuber (Manihot esculenta, 240, SAMN15090728)	47,361,736	93%	38%	126,649
SAMN15090729	NA	tuber (Manihot esculenta, 240, SAMN15090729)	41,562,090	93%	38%	132,221
SAMN15090730	NA	tuber (Manihot esculenta, 240, SAMN15090730)	39,815,500	92%	38%	126,705
SAMN15090731	NA	tuber (Manihot esculenta, 240, SAMN15090731)	43,363,152	93%	37%	132,863

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
ERR4801903	ERX4671428	ERP124855	SAMEA7532454	80,864,256	46%	28%
ERR4801904	ERX4671429	ERP124855	SAMEA7532454	82,635,048	46%	27%
ERR4801905	ERX4671430	ERP124855	SAMEA7532454	66,151,860	41%	27%
ERR4801906	ERX4671431	ERP124855	SAMEA7532455	66,971,788	45%	26%
ERR4801907	ERX4671432	ERP124855	SAMEA7532455	56,115,432	37%	26%
ERR4801908	ERX4671433	ERP124855	SAMEA7532455	68,256,122	39%	26%
ERR4801909	ERX4671434	ERP124855	SAMEA7532456	66,672,060	38%	24%
ERR4801910	ERX4671435	ERP124855	SAMEA7532456	44,497,336	42%	23%
ERR4801911	ERX4671436	ERP124855	SAMEA7532456	64,361,260	39%	24%
ERR4801912	ERX4671437	ERP124855	SAMEA7532457	75,602,802	43%	27%
ERR4801913	ERX4671438	ERP124855	SAMEA7532457	41,945,344	44%	27%
ERR4801914	ERX4671439	ERP124855	SAMEA7532457	63,223,862	41%	27%
ERR4801915	ERX4671440	ERP124855	SAMEA7532458	65,664,744	43%	26%
ERR4801916	ERX4671441	ERP124855	SAMEA7532458	43,837,292	43%	26%
ERR4801917	ERX4671442	ERP124855	SAMEA7532458	48,995,270	42%	24%
ERR4801918	ERX4671443	ERP124855	SAMEA7532459	115,044,668	23%	26%
ERR4801919	ERX4671444	ERP124855	SAMEA7532459	119,846,738	22%	24%
ERR4801920	ERX4671445	ERP124855	SAMEA7532459	60,237,300	40%	25%
ERR4801921	ERX4671446	ERP124855	SAMEA7532460	66,513,178	39%	29%
ERR4801922	ERX4671447	ERP124855	SAMEA7532460	91,457,256	25%	26%
ERR4801923	ERX4671448	ERP124855	SAMEA7532460	63,082,842	37%	26%
ERR4801924	ERX4671449	ERP124855	SAMEA7532461	102,245,952	33%	27%
ERR4801925	ERX4671450	ERP124855	SAMEA7532461	96,333,170	29%	26%
ERR4801926	ERX4671451	ERP124855	SAMEA7532461	91,318,356	30%	26%
ERR4801927	ERX4671452	ERP124855	SAMEA7532462	69,457,328	36%	25%
ERR4801928	ERX4671453	ERP124855	SAMEA7532462	73,504,674	41%	23%
ERR4801929	ERX4671454	ERP124855	SAMEA7532462	59,602,790	33%	24%
ERR4801930	ERX4671455	ERP124855	SAMEA7532463	80,980,946	25%	26%
ERR4801931	ERX4671456	ERP124855	SAMEA7532463	80,693,968	24%	26%
ERR4801932	ERX4671457	ERP124855	SAMEA7532463	43,463,660	25%	25%
ERR4801933	ERX4671458	ERP124855	SAMEA7532464	63,414,264	22%	26%
ERR4801934	ERX4671459	ERP124855	SAMEA7532464	84,405,906	27%	27%
ERR4801935	ERX4671460	ERP124855	SAMEA7532464	87,847,894	22%	25%
ERR4801936	ERX4671461	ERP124855	SAMEA7532465	90,487,198	23%	25%
ERR4801937	ERX4671462	ERP124855	SAMEA7532465	72,623,582	25%	26%
ERR4801938	ERX4671463	ERP124855	SAMEA7532465	95,853,304	26%	25%
SRR955444	SRX333038	SRP028613	SAMN02263987	205,405	77%	56%
SRR955445	SRX333040	SRP028613	SAMN02263988	346,000	82%	58%
SRR955446	SRX333041	SRP028613	SAMN02263989	208,365	79%	56%
SRR955447	SRX333042	SRP028613	SAMN02263990	291,438	78%	54%
SRR955448	SRX333043	SRP028613	SAMN02263991	295,774	79%	58%
SRR955449	SRX333044	SRP028613	SAMN02263992	261,139	79%	57%
SRR955450	SRX333045	SRP028613	SAMN02263993	190,325	81%	57%
SRR955451	SRX333046	SRP028613	SAMN02263994	299,198	80%	59%
SRR955453	SRX333048	SRP028613	SAMN02263996	50,550	74%	53%
SRR955456	SRX333051	SRP028613	SAMN02263999	232,379	31%	38%
SRR1050894	SRX392772	SRP034526	SAMN02444908	34,253,318	52%	20%
SRR1050893	SRX392770	SRP034526	SAMN02444909	17,083,990	76%	19%
SRR1050891	SRX392768	SRP034526	SAMN02444910	17,570,286	64%	18%
SRR1050895	SRX392773	SRP034526	SAMN02444911	7,519,304	26%	19%
SRR1050892	SRX392769	SRP034526	SAMN02444912	20,506,512	56%	18%
SRR1050896	SRX392774	SRP034526	SAMN02444913	15,164,550	83%	19%
SRR1050900	SRX392777	SRP034526	SAMN02444914	32,761,886	48%	20%
SRR1050897	SRX392771	SRP034526	SAMN02444915	16,091,954	61%	18%
SRR1050902	SRX392779	SRP034526	SAMN02444916	13,776,232	83%	19%
SRR1050899	SRX392776	SRP034526	SAMN02444917	14,924,866	75%	19%
SRR1050901	SRX392778	SRP034526	SAMN02444918	7,804,156	22%	19%
SRR1050898	SRX392775	SRP034526	SAMN02444919	19,272,674	53%	18%
SRR1664784	SRX769758	SRP050325	SAMN03225035	1,121,424	41%	78%
SRR1664785	SRX769759	SRP050325	SAMN03225177	686,011	39%	50%
SRR1664786	SRX769759	SRP050325	SAMN03225177	826,727	23%	41%
SRR3629864	SRX1821753	SRP076160	SAMN05208170	41,867,734	91%	23%
SRR3629867	SRX1821755	SRP076160	SAMN05208171	22,189,088	89%	28%
SRR3629818	SRX1821724	SRP076160	SAMN05208172	26,468,968	82%	23%
SRR3629819	SRX1821725	SRP076160	SAMN05208173	18,847,368	80%	22%
SRR3629821	SRX1821726	SRP076160	SAMN05208174	25,053,278	81%	20%
SRR3629822	SRX1821727	SRP076160	SAMN05208175	33,101,088	87%	24%
SRR3629824	SRX1821728	SRP076160	SAMN05208176	32,669,760	87%	20%
SRR3629825	SRX1821729	SRP076160	SAMN05208177	28,015,652	84%	22%
SRR3629827	SRX1821730	SRP076160	SAMN05208178	30,303,298	86%	24%
SRR3629829	SRX1821731	SRP076160	SAMN05208179	36,133,598	86%	25%
SRR3629830	SRX1821732	SRP076160	SAMN05208180	32,136,026	88%	24%
SRR3629832	SRX1821733	SRP076160	SAMN05208181	17,628,586	83%	22%
SRR3629834	SRX1821734	SRP076160	SAMN05208182	24,790,804	87%	29%
SRR3629835	SRX1821735	SRP076160	SAMN05208183	38,616,756	87%	23%
SRR3629837	SRX1821736	SRP076160	SAMN05208184	30,841,568	87%	23%
SRR3629838	SRX1821737	SRP076160	SAMN05208185	11,246,666	84%	21%
SRR3629840	SRX1821738	SRP076160	SAMN05208186	28,711,148	90%	23%
SRR3629842	SRX1821739	SRP076160	SAMN05208187	36,284,106	89%	20%
SRR3629843	SRX1821740	SRP076160	SAMN05208188	32,521,626	90%	22%
SRR3629845	SRX1821741	SRP076160	SAMN05208189	27,396,130	90%	25%
SRR3629847	SRX1821742	SRP076160	SAMN05208190	37,716,286	90%	26%
SRR3629848	SRX1821743	SRP076160	SAMN05208191	33,294,020	91%	26%
SRR3629850	SRX1821744	SRP076160	SAMN05208192	17,736,008	87%	22%
SRR3629851	SRX1821745	SRP076160	SAMN05208193	29,048,736	88%	26%
SRR3629853	SRX1821746	SRP076160	SAMN05208194	21,361,852	87%	22%
SRR3629854	SRX1821747	SRP076160	SAMN05208195	25,650,536	88%	22%
SRR3629865	SRX1821754	SRP076160	SAMN05208196	17,131,430	70%	22%
SRR3629856	SRX1821748	SRP076160	SAMN05208197	30,707,318	90%	20%
SRR3629858	SRX1821749	SRP076160	SAMN05208198	33,673,580	89%	23%
SRR3629859	SRX1821750	SRP076160	SAMN05208199	25,615,976	86%	19%
SRR3629861	SRX1821751	SRP076160	SAMN05208200	24,583,548	85%	24%
SRR3629862	SRX1821752	SRP076160	SAMN05208201	58,625,198	91%	26%
SRR4444985	SRX2264031	SRP091945	SAMN05933012	25,771,488	71%	26%
SRR4444984	SRX2264030	SRP091945	SAMN05933013	34,186,514	61%	24%
SRR4444987	SRX2264033	SRP091945	SAMN05933014	27,633,062	64%	25%
SRR4444986	SRX2264032	SRP091945	SAMN05933015	29,174,012	69%	25%
SRR4444989	SRX2264035	SRP091945	SAMN05933016	29,298,282	70%	24%
SRR4444988	SRX2264034	SRP091945	SAMN05933017	31,697,218	73%	25%
SRR5725621	SRX2941559	SRP110033	SAMN07261492	36,359,738	93%	27%
SRR5725620	SRX2941558	SRP110033	SAMN07261493	35,727,884	91%	26%
SRR5725628	SRX2941566	SRP110033	SAMN07261550	40,560,692	92%	26%
SRR5725627	SRX2941565	SRP110033	SAMN07261551	27,020,166	93%	26%
SRR5725626	SRX2941564	SRP110033	SAMN07261552	30,946,486	93%	26%
SRR5725625	SRX2941563	SRP110033	SAMN07261553	34,366,342	94%	26%
SRR5725624	SRX2941562	SRP110033	SAMN07261554	33,697,024	93%	26%
SRR5725623	SRX2941561	SRP110033	SAMN07261555	37,142,480	93%	26%
SRR5725622	SRX2941560	SRP110033	SAMN07261556	36,127,236	94%	27%
SRR6748431	SRX3721110	SRP133051	SAMN08564055	51,445,072	91%	36%
SRR6748430	SRX3721111	SRP133051	SAMN08564055	43,974,612	90%	36%
SRR6748428	SRX3721113	SRP133051	SAMN08564055	60,746,404	90%	36%
SRR6748435	SRX3721106	SRP133051	SAMN08564056	50,775,752	91%	36%
SRR6748434	SRX3721107	SRP133051	SAMN08564056	48,370,928	90%	35%
SRR6748429	SRX3721112	SRP133051	SAMN08564056	48,679,682	91%	35%
SRR6748436	SRX3721105	SRP133051	SAMN08564057	60,219,014	89%	36%
SRR6748433	SRX3721108	SRP133051	SAMN08564057	48,510,866	90%	36%
SRR6748432	SRX3721109	SRP133051	SAMN08564057	58,347,850	91%	37%
SRR6748437	SRX3721104	SRP133051	SAMN08564058	56,257,756	90%	35%
SRR6748427	SRX3721114	SRP133051	SAMN08564058	56,156,762	91%	36%
SRR6748426	SRX3721115	SRP133051	SAMN08564058	51,053,880	91%	36%
SRR7469569	SRX4339333	SRP151951	SAMN09470104	39,737,282	88%	39%
SRR7469568	SRX4339334	SRP151951	SAMN09470105	36,332,152	86%	38%
SRR7469571	SRX4339331	SRP151951	SAMN09470106	40,358,974	88%	39%
SRR7469570	SRX4339332	SRP151951	SAMN09470107	33,436,654	88%	38%
SRR7469565	SRX4339337	SRP151951	SAMN09470108	41,713,642	87%	39%
SRR7469564	SRX4339338	SRP151951	SAMN09470109	36,280,500	88%	39%
SRR7469567	SRX4339335	SRP151951	SAMN09470110	47,967,278	87%	38%
SRR7469566	SRX4339336	SRP151951	SAMN09470111	44,742,854	88%	37%
SRR7469563	SRX4339339	SRP151951	SAMN09470112	47,993,632	88%	39%
SRR8573592	SRX5374575	SRP185866	SAMN10928296	53,783,458	89%	37%
SRR8573593	SRX5374574	SRP185866	SAMN10928297	53,076,862	89%	38%
SRR8573607	SRX5374560	SRP185866	SAMN10928298	59,044,076	85%	37%
SRR8573601	SRX5374566	SRP185866	SAMN10928299	54,357,350	85%	36%
SRR8573598	SRX5374569	SRP185866	SAMN10928300	54,954,720	89%	37%
SRR8573605	SRX5374562	SRP185866	SAMN10928301	57,013,328	87%	37%
SRR8573602	SRX5374565	SRP185866	SAMN10928302	45,746,592	85%	36%
SRR8573603	SRX5374564	SRP185866	SAMN10928303	57,807,330	85%	37%
SRR8573599	SRX5374568	SRP185866	SAMN10928304	57,824,440	89%	38%
SRR8573600	SRX5374567	SRP185866	SAMN10928305	59,100,342	87%	37%
SRR8573588	SRX5374579	SRP185866	SAMN10928306	61,176,816	85%	38%
SRR8573589	SRX5374578	SRP185866	SAMN10928307	54,520,108	85%	37%
SRR8573590	SRX5374577	SRP185866	SAMN10928308	51,305,474	88%	38%
SRR8573591	SRX5374576	SRP185866	SAMN10928309	56,195,990	88%	37%
SRR8573606	SRX5374561	SRP185866	SAMN10928310	57,805,040	85%	36%
SRR8573604	SRX5374563	SRP185866	SAMN10928311	49,557,808	85%	37%
SRR8573594	SRX5374573	SRP185866	SAMN10928312	51,902,006	87%	37%
SRR8573595	SRX5374572	SRP185866	SAMN10928313	43,959,656	88%	37%
SRR8573596	SRX5374571	SRP185866	SAMN10928314	52,319,068	86%	35%
SRR8573597	SRX5374570	SRP185866	SAMN10928315	48,547,092	86%	36%
SRR10503231	SRX7192185	SRP230817	SAMN13324141	44,747,686	89%	32%
SRR10503230	SRX7192186	SRP230817	SAMN13324142	43,460,418	89%	31%
SRR10503229	SRX7192187	SRP230817	SAMN13324143	41,941,878	89%	33%
SRR10503228	SRX7192188	SRP230817	SAMN13324144	44,112,218	90%	33%
SRR11914025	SRX8460574	SRP265734	SAMN15090723	43,271,144	93%	38%
SRR11914024	SRX8460575	SRP265734	SAMN15090724	52,103,066	93%	38%
SRR11914023	SRX8460576	SRP265734	SAMN15090725	51,326,090	93%	38%
SRR11914022	SRX8460577	SRP265734	SAMN15090726	52,253,824	94%	38%
SRR11914021	SRX8460578	SRP265734	SAMN15090727	41,798,826	93%	38%
SRR11914020	SRX8460579	SRP265734	SAMN15090728	47,361,736	93%	38%
SRR11914019	SRX8460580	SRP265734	SAMN15090729	41,562,090	93%	38%
SRR11914018	SRX8460581	SRP265734	SAMN15090730	39,815,500	92%	38%
SRR11914017	SRX8460582	SRP265734	SAMN15090731	43,363,152	93%	37%
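The per-sample read counts are the sums of the per-run counts for that sample. The sketch below checks this for two samples, using run rows copied from the table above; the variable names are illustrative.

```python
from collections import defaultdict

# (run, sample, number of reads), copied from the per-run table
runs = [
    ("ERR4801903", "SAMEA7532454", 80_864_256),
    ("ERR4801904", "SAMEA7532454", 82_635_048),
    ("ERR4801905", "SAMEA7532454", 66_151_860),
    ("SRR1664785", "SAMN03225177", 686_011),
    ("SRR1664786", "SAMN03225177", 826_727),
]

# Sum the read counts of all runs that belong to the same sample.
reads_per_sample = defaultdict(int)
for _run, sample, n_reads in runs:
    reads_per_sample[sample] += n_reads

for sample, total in reads_per_sample.items():
    print(f"{sample}: {total:,} reads")
```

The totals agree with the per-sample table: 229,651,164 reads for SAMEA7532454 and 1,512,738 reads for SAMN03225177.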

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Jatropha curcas high-quality model RefSeq (XP_)	11,976	11,626 (97.08%)	11,626 (97.08%)	72.88%	85.02%
Crotonoideae GenBank	1,791	1,375 (76.77%)	1,375 (76.77%)	76.29%	88.38%
Crotonoideae known RefSeq (NP_)	216	210 (97.22%)	210 (97.22%)	74.99%	86.42%
Cucumis melo high-quality model RefSeq (XP_)	11,885	11,563 (97.29%)	11,563 (97.29%)	69.80%	80.28%
Cucumis melo known RefSeq (NP_)	131	124 (94.66%)	124 (94.66%)	70.61%	86.40%
Arabidopsis thaliana known RefSeq (NP_)	48,148	35,299 (73.31%)	35,299 (73.31%)	67.53%	74.77%
Glycine max high-quality model RefSeq (XP_)	23,544	22,445 (95.33%)	22,445 (95.33%)	68.95%	79.24%
Glycine max known RefSeq (NP_)	7,942	7,471 (94.07%)	7,471 (94.07%)	71.21%	80.86%
Same-species GenBank	681	616 (90.46%)	616 (90.46%)	77.54%	87.11%
Eucalyptus grandis high-quality model RefSeq (XP_)	14,630	13,976 (95.53%)	13,976 (95.53%)	69.11%	80.46%
Eucalyptus grandis known RefSeq (NP_)	37	36 (97.30%)	36 (97.30%)	74.78%	82.18%
Populus euphratica high-quality model RefSeq (XP_)	18,422	17,336 (94.10%)	17,336 (94.10%)	71.61%	83.55%
Populus euphratica known RefSeq (NP_)	39	39 (100.00%)	39 (100.00%)	68.28%	81.87%

Assembly-assembly alignments of current to previous assembly

When the assembly changes between two rounds of annotation, genes in the current and the previous annotation are mapped to each other using the genomic alignments of the current assembly to the previous assembly so that gene identifiers can be preserved. The success of the remapping depends largely on how well the two assembly versions align to each other.

Below are the percent coverage of one assembly by the other and the average percent identity of the alignments. The 'First pass' alignments are reciprocal best hits, while the 'Total' alignments also include 'Second pass' or non-reciprocal best alignments. For more information about the assembly-assembly alignment process, please visit the NCBI Genome Remapping Service page.

	First Pass	Total
M.esculenta_v8 (Current) Coverage	73.48%	76.69%
Manihot esculenta v6 (Previous) Coverage	94.57%	95.18%
Percent Identity	99.90%	99.62%
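The notion of reciprocal best hits used for the 'First pass' alignments can be illustrated with a toy example. This is only a sketch of the general idea under simplified assumptions (one score per pair of regions, no ties); it is not the NCBI assembly-assembly alignment procedure, and the region names and scores are hypothetical.

```python
def best_hits(scores: dict) -> dict:
    """Map each first element of a pair to its highest-scoring partner."""
    best = {}
    for (a, b), score in scores.items():
        if a not in best or score > best[a][1]:
            best[a] = (b, score)
    return {a: partner for a, (partner, _score) in best.items()}


def reciprocal_best_hits(scores: dict) -> list:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(scores)
    reverse = best_hits({(b, a): s for (a, b), s in scores.items()})
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical alignment scores between current and previous assembly regions.
scores = {
    ("cur_1", "prev_1"): 950,
    ("cur_1", "prev_2"): 400,
    ("cur_2", "prev_1"): 700,
    ("cur_2", "prev_2"): 300,
}
pairs = reciprocal_best_hits(scores)
print(pairs)
```

Here `cur_2` has `prev_1` as its best hit, but `prev_1` is best matched by `cur_1`, so only the pair `cur_1`/`prev_1` is reciprocal. A non-reciprocal alignment such as `cur_2` to `prev_1` is the kind that would count only toward the 'Total' column.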

Comparison of the current and previous annotations

The annotation produced for this release (101) was compared to the annotation in the previous release (100) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	M.esculenta_v8 (Current) to Manihot_esculenta_v6 (Previous)
Identical	8%
Minor changes	71%
Major changes	9%
New	12%
Deprecated	5%
Other	<1%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
Minimap2: Li H. Bioinformatics 2018, 34(18):3094-100
