NCBI Pieris rapae Annotation Release 101

The RefSeq genome records for Pieris rapae were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
BUSCO results: Annotation completeness assessed with BUSCO
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Similarity of current and previous assembly: How well the current and previous assemblies align to each other
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Pieris rapae Annotation Release 101.

Annotation release ID: 101
Date of Entrez queries for transcripts and proteins: Dec 22 2021
Date of submission of annotation to the public databases: Dec 29 2021
Software version: 9.0

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
ilPieRapa1.1	GCF_905147795.1	Wellcome Sanger Institute	01-28-2021	Reference	27 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	ilPieRapa1.1
Genes and pseudogenes	13,630
  protein-coding	12,147
  non-coding	1,396
  Transcribed pseudogenes	0
  Non-transcribed pseudogenes	87
  genes with variants	3,747
  Immunoglobulin/T-cell receptor gene segments	0
  other	0
mRNAs	20,484
  fully-supported	19,723
  with > 5% ab initio	265
  partial	46
  with filled gap(s)	2
  known RefSeq (NM_)	0
  model RefSeq (XM_)	20,484
non-coding RNAs	1,744
  fully-supported	1,134
  with > 5% ab initio	0
  partial	0
  with filled gap(s)	0
  known RefSeq (NR_)	0
  model RefSeq (XR_)	1,360
pseudo transcripts	0
  fully-supported	0
  with > 5% ab initio	0
  partial	0
  with filled gap(s)	0
  known RefSeq (NR_)	0
  model RefSeq (XR_)	0
CDSs	20,497
  fully-supported	19,723
  with > 5% ab initio	291
  partial	46
  with major correction(s)	51
  known RefSeq (NP_)	0
  model RefSeq (XP_)	20,497

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	13,543	12,589	4,689	60	437,627
All transcripts	22,228	3,012	2,143	60	59,946
  mRNA	20,484	3,190	2,299	230	59,946
  misc_RNA	264	2,816	2,201	226	11,207
  tRNA	382	73	73	60	84
  lncRNA	872	839	605	74	7,693
  snoRNA	16	126	84	69	203
  snRNA	57	139	139	73	193
  rRNA	153	632	119	115	3,951
Single-exon transcripts	1,043	1,628	1,341	258	11,526
  coding transcripts (NM_/XM_)	1,043	1,628	1,341	258	11,526
CDSs	20,497	2,225	1,446	105	59,397
Exons	109,427	303	164	2	22,622
  in coding transcripts (NM_/XM_)	106,743	303	164	2	22,622
  in non-coding transcripts (NR_/XR_)	4,214	272	155	2	6,957
Introns	95,740	1,798	491	30	299,757
  in coding transcripts (NM_/XM_)	93,978	1,804	491	30	299,757
  in non-coding transcripts (NR_/XR_)	3,211	1,761	476	30	114,448

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.66	1	1	50
Number of exons per transcript	10.56	7	1	162

BUSCO analysis of gene annotation

BUSCO v4.1.4 (Simão et al. 2015, PMID: 26059717) was run in "protein" mode with the lepidoptera_odb10 lineage dataset on the annotated gene set, using the single longest protein per gene. Results are reported for the gene set from the primary assembly unit and are presented in BUSCO notation (C: complete [S: single-copy, D: duplicated], F: fragmented, M: missing, n: number of genes used).
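The BUSCO notation described above is a compact one-line string. As a minimal sketch of how such a string could be read programmatically, the Python snippet below parses it with a regular expression. The function name `parse_busco` and the example string are illustrative assumptions; the example values are placeholders and are not the BUSCO results for this annotation release.

```python
import re

# Pattern for the one-line BUSCO notation:
# C:..%[S:..%,D:..%],F:..%,M:..%,n:...
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco(summary: str) -> dict:
    """Return the BUSCO categories as percentages (float) and n as an int."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary: {summary!r}")
    result = {
        key: float(value)
        for key, value in match.groupdict().items()
        if key != "n"
    }
    result["n"] = int(match.group("n"))
    return result


# Placeholder string for illustration only, NOT the result for this release.
example = "C:90.0%[S:85.0%,D:5.0%],F:4.0%,M:6.0%,n:1000"
scores = parse_busco(example)

# Complete is the sum of single-copy and duplicated.
assert abs(scores["S"] + scores["D"] - scores["C"]) < 1e-9
print(scores)
```

In this notation, complete, fragmented and missing together account for all n genes in the lineage dataset, which is a useful sanity check when reading a summary line.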

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Drosophila melanogaster known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 12,134 coding genes, 9,390 genes had a protein with an alignment covering 50% or more of the query and 3,065 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Drosophila melanogaster known RefSeq proteins
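
The coverage definitions in this section can be illustrated with a short Python sketch. It assumes 1-based inclusive alignment coordinates and ignores gaps within the alignment, which is a simplification; the function names and the example alignment are hypothetical. The last two lines reuse the gene counts reported above to express them as fractions of the 12,134 coding genes.

```python
def coverage(aligned_start: int, aligned_end: int, sequence_length: int) -> float:
    """Percent of a sequence included in an alignment.

    Coordinates are assumed to be 1-based and inclusive; gaps are ignored.
    """
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


def count_at_or_above(coverages, threshold: float) -> int:
    """Number of coverage values at or above a threshold (one cumulative-graph point)."""
    return sum(1 for value in coverages if value >= threshold)


# Hypothetical alignment: residues 11-410 of a 500-residue annotated protein
# (query) aligned to residues 1-400 of a 420-residue target protein.
query_cov = coverage(11, 410, 500)
target_cov = coverage(1, 400, 420)
print(f"query coverage {query_cov:.1f}%, target coverage {target_cov:.1f}%")

# Counts taken from this report, expressed as a share of all coding genes.
share_50 = round(100.0 * 9390 / 12134, 1)
share_95 = round(100.0 * 3065 / 12134, 1)
print(f"{share_50}% of genes at >=50% query coverage, {share_95}% at >=95%")
```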

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker (if calculated), for each assembly. RepeatMasker results are only calculated for organisms with complete Dfam HMM model collections.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with WindowMasker
ilPieRapa1.1	GCF_905147795.1	38.47%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign, minimap2, or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	410	406 (99.02%)	392 (95.61%)	98.74%	99.77%
Same-species EST	286	241 (84.27%)	222 (77.62%)	98.67%	99.39%
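
The percentages in the transcript alignment table are the aligned and passed counts divided by the number of sequences retrieved from Entrez. The short Python sketch below recomputes them from the counts in the table; the helper name `percent` and the dictionary layout are assumptions made for illustration.

```python
def percent(part: int, total: int) -> float:
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# source: (retrieved from Entrez, aligned by Splign, passed to Gnomon)
transcripts = {
    "Same-species GenBank": (410, 406, 392),
    "Same-species EST": (286, 241, 222),
}

for source, (retrieved, aligned, passed) in transcripts.items():
    print(
        f"{source}: aligned {percent(aligned, retrieved)}%, "
        f"passed to Gnomon {percent(passed, retrieved)}%"
    )
```

The same calculation applies to the protein alignment table, with ProSplign in place of Splign.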

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	2,408,632,692	85%	18%	107,158
SAMEA4955073	NA	Pieris rapae larva on Arabidopsis kamchatica (Pieris rapae crucivora, SAMEA4955073)	32,293,992	84%	28%	82,956
SAMEA4955074	NA	Pieris rapae larva on Cardamine occulta (Pieris rapae crucivora, SAMEA4955074)	33,902,960	85%	28%	83,326
SAMEA5364455	NA	whole body RNA extraction (Pieris rapae, female, SAMEA5364455)	82,136,200	87%	18%	89,666
SAMEA5364456	NA	whole body RNA extraction (Pieris rapae, male, SAMEA5364456)	82,696,072	87%	17%	89,745
SAMEA7523264	NA	WHOLE ORGANISM (Pieris rapae, female, SAMEA7523264)	32,677,156	72%	34%	74,351
SAMN03742523	NA	whole body (Pieris rapae, larval I, SAMN03742523)	52,772,096	85%	33%	92,297
SAMN03742524	NA	whole body (Pieris rapae, larval III, SAMN03742524)	53,329,586	86%	39%	87,939
SAMN03742525	NA	whole body (Pieris rapae, larval V, SAMN03742525)	43,518,160	84%	32%	85,429
SAMN03742526	NA	whole body (Pieris rapae, pupa, SAMN03742526)	43,654,136	85%	29%	87,314
SAMN03742527	NA	whole body (Pieris rapae, adult, SAMN03742527)	46,335,948	80%	34%	92,950
SAMN04866670	NA	whole body (Pieris rapae, female, SAMN04866670)	23,785,540	89%	39%	79,916
SAMN06709001	28549454	gut (Pieris rapae, not determined, SAMN06709001)	56,370,666	87%	28%	76,025
SAMN06709002	28549454	gut (Pieris rapae, not determined, SAMN06709002)	46,987,904	87%	29%	69,952
SAMN06709003	28549454	gut (Pieris rapae, female, SAMN06709003)	36,204,280	88%	31%	64,080
SAMN06709004	28549454	gut (Pieris rapae, female, SAMN06709004)	53,367,848	85%	28%	73,717
SAMN06709005	28549454	fat body (Pieris rapae, female, SAMN06709005)	52,320,402	92%	25%	62,149
SAMN06709006	28549454	gut (Pieris rapae, female, SAMN06709006)	53,886,480	88%	32%	65,893
SAMN06709007	28549454	gut (Pieris rapae, female, SAMN06709007)	59,808,256	88%	31%	59,608
SAMN06754559	28549454	gut (Pieris rapae, male, SAMN06754559)	10,738,126	82%	8%	42,997
SAMN06754560	28549454	gut (Pieris rapae, male, SAMN06754560)	12,162,828	84%	10%	41,883
SAMN06754561	28549454	gut (Pieris rapae, male, SAMN06754561)	6,656,742	22%	5%	9,261
SAMN06754562	28549454	gut (Pieris rapae, female, SAMN06754562)	12,036,496	84%	10%	41,479
SAMN06754563	28549454	gut (Pieris rapae, female, SAMN06754563)	9,776,703	80%	10%	17,508
SAMN06754564	28549454	gut (Pieris rapae, female, SAMN06754564)	11,632,303	84%	11%	42,452
SAMN06754565	28549454	gut (Pieris rapae, male, SAMN06754565)	12,776,148	86%	11%	39,609
SAMN06754566	28549454	gut (Pieris rapae, female, SAMN06754566)	10,668,062	84%	11%	34,997
SAMN06754567	28549454	gut (Pieris rapae, male, SAMN06754567)	9,561,424	64%	9%	8,809
SAMN06754568	28549454	gut (Pieris rapae, female, SAMN06754568)	11,965,146	84%	11%	38,214
SAMN06754569	28549454	gut (Pieris rapae, female, SAMN06754569)	14,297,376	85%	11%	36,703
SAMN06754570	28549454	gut (Pieris rapae, female, SAMN06754570)	11,409,940	84%	10%	34,576
SAMN06754571	28549454	gut (Pieris rapae, female, SAMN06754571)	12,497,594	81%	11%	37,341
SAMN06754572	28549454	gut (Pieris rapae, female, SAMN06754572)	12,891,067	85%	11%	38,338
SAMN06754573	28549454	gut (Pieris rapae, female, SAMN06754573)	12,356,179	85%	10%	43,019
SAMN06754574	28549454	gut (Pieris rapae, male, SAMN06754574)	11,792,993	86%	11%	39,796
SAMN06754575	28549454	gut (Pieris rapae, male, SAMN06754575)	11,690,019	84%	9%	39,720
SAMN06754576	28549454	gut (Pieris rapae, female, SAMN06754576)	12,334,876	86%	11%	38,427
SAMN06754577	28549454	gut (Pieris rapae, male, SAMN06754577)	8,458,263	80%	10%	28,007
SAMN06754578	28549454	gut (Pieris rapae, female, SAMN06754578)	12,999,895	85%	10%	39,403
SAMN06754579	28549454	gut (Pieris rapae, male, SAMN06754579)	11,480,488	84%	10%	39,682
SAMN06754580	28549454	gut (Pieris rapae, male, SAMN06754580)	13,442,376	85%	11%	38,923
SAMN06754581	28549454	gut (Pieris rapae, female, SAMN06754581)	13,376,240	83%	10%	38,994
SAMN06754582	28549454	gut (Pieris rapae, female, SAMN06754582)	12,347,155	85%	10%	37,888
SAMN06754583	28549454	gut (Pieris rapae, male, SAMN06754583)	10,990,923	82%	9%	39,614
SAMN06754584	28549454	gut (Pieris rapae, male, SAMN06754584)	11,353,287	82%	9%	34,425
SAMN06754585	28549454	gut (Pieris rapae, female, SAMN06754585)	11,323,569	80%	9%	38,474
SAMN06754586	28549454	gut (Pieris rapae, female, SAMN06754586)	11,488,879	86%	9%	40,046
SAMN06754587	28549454	gut (Pieris rapae, male, SAMN06754587)	11,673,361	83%	9%	47,428
SAMN06754588	28549454	gut (Pieris rapae, female, SAMN06754588)	9,251,618	82%	9%	38,066
SAMN06754589	28549454	gut (Pieris rapae, male, SAMN06754589)	12,491,335	83%	8%	44,554
SAMN06754590	28549454	gut (Pieris rapae, female, SAMN06754590)	13,020,732	85%	11%	41,487
SAMN06754591	28549454	gut (Pieris rapae, male, SAMN06754591)	10,037,557	86%	11%	36,237
SAMN06754592	28549454	gut (Pieris rapae, female, SAMN06754592)	13,239,550	86%	9%	42,381
SAMN06754593	28549454	gut (Pieris rapae, female, SAMN06754593)	12,507,564	84%	10%	36,824
SAMN06754594	28549454	gut (Pieris rapae, male, SAMN06754594)	11,150,878	85%	11%	32,482
SAMN06754595	28549454	gut (Pieris rapae, male, SAMN06754595)	14,211,208	85%	11%	37,399
SAMN06754596	28549454	2nd instar, gut (Pieris rapae, SAMN06754596)	11,136,867	80%	10%	49,975
SAMN06754597	28549454	gut (Pieris rapae, male, SAMN06754597)	10,671,834	85%	11%	36,747
SAMN06754598	28549454	gut (Pieris rapae, male, SAMN06754598)	11,293,627	85%	11%	39,143
SAMN06754599	28549454	gut (Pieris rapae, male, SAMN06754599)	12,808,038	84%	11%	35,433
SAMN06754600	28549454	gut (Pieris rapae, female, SAMN06754600)	12,015,848	84%	10%	40,133
SAMN06754601	28549454	gut (Pieris rapae, female, SAMN06754601)	12,557,854	84%	11%	39,078
SAMN06754602	28549454	gut (Pieris rapae, male, SAMN06754602)	10,887,237	84%	11%	31,818
SAMN06754603	28549454	gut (Pieris rapae, male, SAMN06754603)	10,628,776	84%	11%	36,452
SAMN06754604	28549454	gut (Pieris rapae, female, SAMN06754604)	11,840,435	85%	11%	36,375
SAMN06754605	28549454	gut (Pieris rapae, male, SAMN06754605)	11,518,260	85%	11%	38,696
SAMN06754606	28549454	gut (Pieris rapae, female, SAMN06754606)	11,937,485	87%	7%	43,431
SAMN06754607	28549454	gut (Pieris rapae, male, SAMN06754607)	13,024,420	86%	11%	39,334
SAMN06754608	28549454	gut (Pieris rapae, male, SAMN06754608)	12,514,841	82%	9%	42,085
SAMN06754609	28549454	gut (Pieris rapae, female, SAMN06754609)	11,690,014	87%	9%	37,417
SAMN06754610	28549454	gut (Pieris rapae, female, SAMN06754610)	11,694,203	85%	11%	37,238
SAMN06754611	28549454	gut (Pieris rapae, female, SAMN06754611)	13,047,022	84%	9%	27,028
SAMN06754612	28549454	gut (Pieris rapae, male, SAMN06754612)	13,084,671	84%	11%	37,593
SAMN06754613	28549454	gut (Pieris rapae, male, SAMN06754613)	11,380,539	84%	11%	38,324
SAMN06754614	28549454	gut (Pieris rapae, male, SAMN06754614)	12,470,523	85%	11%	39,909
SAMN06754615	28549454	gut (Pieris rapae, female, SAMN06754615)	11,824,360	85%	11%	38,134
SAMN06754616	28549454	gut (Pieris rapae, male, SAMN06754616)	13,063,164	85%	10%	40,942
SAMN06754617	28549454	gut (Pieris rapae, female, SAMN06754617)	11,019,143	78%	11%	34,898
SAMN06754618	28549454	gut (Pieris rapae, male, SAMN06754618)	11,448,280	84%	11%	37,875
SAMN06754619	28549454	gut (Pieris rapae, female, SAMN06754619)	13,081,351	84%	11%	31,404
SAMN06754620	28549454	2nd instar, gut (Pieris rapae, SAMN06754620)	10,837,371	82%	10%	47,393
SAMN06754621	28549454	gut (Pieris rapae, male, SAMN06754621)	10,337,922	84%	10%	37,287
SAMN06754622	28549454	gut (Pieris rapae, female, SAMN06754622)	10,418,966	84%	10%	29,713
SAMN06754623	28549454	gut (Pieris rapae, female, SAMN06754623)	12,000,842	85%	11%	38,752
SAMN06754624	28549454	gut (Pieris rapae, male, SAMN06754624)	12,824,388	83%	9%	45,909
SAMN06754625	28549454	2nd instar, gut (Pieris rapae, SAMN06754625)	12,063,844	83%	10%	47,112
SAMN06754626	28549454	2nd instar, gut (Pieris rapae, SAMN06754626)	9,336,794	81%	10%	24,615
SAMN06754627	28549454	2nd instar, gut (Pieris rapae, SAMN06754627)	14,603,714	96%	4%	36,742
SAMN06754628	28549454	2nd instar, gut (Pieris rapae, SAMN06754628)	11,699,223	84%	10%	52,546
SAMN06754629	28549454	2nd instar, gut (Pieris rapae, SAMN06754629)	12,997,454	83%	11%	45,552
SAMN06754630	28549454	2nd instar, gut (Pieris rapae, SAMN06754630)	12,187,993	83%	10%	46,779
SAMN06754631	28549454	2nd instar, gut (Pieris rapae, SAMN06754631)	13,943,712	81%	10%	45,721
SAMN06754632	28549454	2nd instar, gut (Pieris rapae, SAMN06754632)	10,135,954	80%	9%	50,539
SAMN06754633	28549454	2nd instar, gut (Pieris rapae, SAMN06754633)	11,732,624	81%	10%	45,592
SAMN06754634	28549454	2nd instar, gut (Pieris rapae, SAMN06754634)	12,317,910	83%	10%	46,578
SAMN06754635	28549454	2nd instar, gut (Pieris rapae, SAMN06754635)	9,799,978	77%	10%	44,904
SAMN06754636	28549454	2nd instar, gut (Pieris rapae, SAMN06754636)	11,079,299	83%	10%	42,777
SAMN06754637	28549454	2nd instar, gut (Pieris rapae, SAMN06754637)	11,997,388	83%	10%	45,227
SAMN06754638	28549454	2nd instar, gut (Pieris rapae, SAMN06754638)	10,797,073	82%	10%	43,639
SAMN06754639	28549454	2nd instar, gut (Pieris rapae, SAMN06754639)	9,707,797	82%	10%	38,865
SAMN06754640	28549454	2nd instar, gut (Pieris rapae, SAMN06754640)	8,756,378	82%	10%	39,747
SAMN06754641	28549454	2nd instar, gut (Pieris rapae, SAMN06754641)	12,678,001	81%	10%	39,552
SAMN06754642	28549454	2nd instar, gut (Pieris rapae, SAMN06754642)	11,097,187	82%	10%	42,841
SAMN06754643	28549454	2nd instar, gut (Pieris rapae, SAMN06754643)	10,820,574	84%	10%	44,745
SAMN06754644	28549454	2nd instar, gut (Pieris rapae, SAMN06754644)	12,831,204	81%	10%	41,181
SAMN06754645	28549454	2nd instar, gut (Pieris rapae, SAMN06754645)	12,677,197	82%	10%	43,410
SAMN06754646	28549454	2nd instar, gut (Pieris rapae, SAMN06754646)	11,267,566	81%	10%	46,703
SAMN06754647	28549454	2nd instar, gut (Pieris rapae, SAMN06754647)	10,838,186	80%	10%	38,879
SAMN06754648	28549454	2nd instar, gut (Pieris rapae, SAMN06754648)	9,115,331	77%	10%	40,980
SAMN06754649	28549454	2nd instar, gut (Pieris rapae, SAMN06754649)	11,737,636	80%	10%	53,922
SAMN06754650	28549454	2nd instar, gut (Pieris rapae, SAMN06754650)	11,473,085	81%	10%	32,795
SAMN06754651	28549454	2nd instar, gut (Pieris rapae, SAMN06754651)	11,602,581	83%	10%	46,761
SAMN06754652	28549454	2nd instar, gut (Pieris rapae, SAMN06754652)	12,915,030	83%	11%	47,174
SAMN06754653	28549454	2nd instar, gut (Pieris rapae, SAMN06754653)	10,775,140	82%	10%	41,798
SAMN06754654	28549454	2nd instar, gut (Pieris rapae, SAMN06754654)	10,128,368	84%	10%	43,090
SAMN07167856	NA	Intestinal tissue (Pieris rapae, female, SAMN07167856)	57,836,792	83%	22%	76,751
SAMN08648809	30076351	egg (Pieris rapae, SAMN08648809)	137,633,572	77%	11%	90,528
SAMN08648810	30076351	larvae (Pieris rapae, SAMN08648810)	237,771,640	82%	12%	93,927

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
ERR2822750	ERX2829494	ERP111316	SAMEA4955073	32,293,992	84%	28%
ERR2822751	ERX2829495	ERP111316	SAMEA4955074	33,902,960	85%	28%
ERR3169839	ERX3197787	ERP113912	SAMEA5364455	10,979,990	87%	18%
ERR3169840	ERX3197788	ERP113912	SAMEA5364455	11,590,432	87%	18%
ERR3169841	ERX3197789	ERP113912	SAMEA5364455	12,007,102	87%	18%
ERR3169842	ERX3197790	ERP113912	SAMEA5364455	11,703,304	87%	18%
ERR3169843	ERX3197791	ERP113912	SAMEA5364455	11,811,548	87%	17%
ERR3169844	ERX3197792	ERP113912	SAMEA5364455	11,860,354	87%	17%
ERR3169845	ERX3197793	ERP113912	SAMEA5364455	12,183,470	87%	17%
ERR3169846	ERX3197794	ERP113912	SAMEA5364456	11,039,000	87%	17%
ERR3169847	ERX3197795	ERP113912	SAMEA5364456	11,662,078	87%	17%
ERR3169848	ERX3197796	ERP113912	SAMEA5364456	12,084,816	87%	17%
ERR3169849	ERX3197797	ERP113912	SAMEA5364456	11,713,040	87%	17%
ERR3169850	ERX3197798	ERP113912	SAMEA5364456	11,870,384	87%	17%
ERR3169851	ERX3197799	ERP113912	SAMEA5364456	11,984,966	87%	17%
ERR3169852	ERX3197800	ERP113912	SAMEA5364456	12,341,788	87%	17%
ERR6286709	ERX5921397	ERP125973	SAMEA7523264	32,677,156	72%	34%
SRR2048605	SRX1046722	SRP059012	SAMN03742523	52,772,096	85%	33%
SRR2048607	SRX1046723	SRP059012	SAMN03742524	53,329,586	86%	39%
SRR2048608	SRX1046724	SRP059012	SAMN03742525	43,518,160	84%	32%
SRR2048609	SRX1046725	SRP059012	SAMN03742526	43,654,136	85%	29%
SRR2048610	SRX1046726	SRP059012	SAMN03742527	46,335,948	80%	34%
SRR4339885	SRX2207223	SRP073457	SAMN04866670	23,785,540	89%	39%
SRR5438367	SRX2728309	SRP103670	SAMN06709001	56,370,666	87%	28%
SRR5438366	SRX2728308	SRP103670	SAMN06709002	46,987,904	87%	29%
SRR5438365	SRX2728307	SRP103670	SAMN06709003	36,204,280	88%	31%
SRR5438364	SRX2728306	SRP103670	SAMN06709004	53,367,848	85%	28%
SRR5438363	SRX2728305	SRP103670	SAMN06709005	52,320,402	92%	25%
SRR5438362	SRX2728304	SRP103670	SAMN06709006	53,886,480	88%	32%
SRR5438361	SRX2728303	SRP103670	SAMN06709007	59,808,256	88%	31%
SRR5447587	SRX2736552	SRP103908	SAMN06754559	10,738,126	82%	8%
SRR5447586	SRX2736551	SRP103908	SAMN06754560	12,162,828	84%	10%
SRR5447585	SRX2736550	SRP103908	SAMN06754561	6,656,742	22%	5%
SRR5447584	SRX2736549	SRP103908	SAMN06754562	12,036,496	84%	10%
SRR5447583	SRX2736548	SRP103908	SAMN06754563	9,776,703	80%	10%
SRR5447582	SRX2736547	SRP103908	SAMN06754564	11,632,303	84%	11%
SRR5447581	SRX2736546	SRP103908	SAMN06754565	12,776,148	86%	11%
SRR5447580	SRX2736545	SRP103908	SAMN06754566	10,668,062	84%	11%
SRR5447579	SRX2736544	SRP103908	SAMN06754567	9,561,424	64%	9%
SRR5447578	SRX2736543	SRP103908	SAMN06754568	11,965,146	84%	11%
SRR5447577	SRX2736542	SRP103908	SAMN06754569	14,297,376	85%	11%
SRR5447576	SRX2736541	SRP103908	SAMN06754570	11,409,940	84%	10%
SRR5447575	SRX2736540	SRP103908	SAMN06754571	12,497,594	81%	11%
SRR5447574	SRX2736539	SRP103908	SAMN06754572	12,891,067	85%	11%
SRR5447573	SRX2736538	SRP103908	SAMN06754573	12,356,179	85%	10%
SRR5447572	SRX2736537	SRP103908	SAMN06754574	11,792,993	86%	11%
SRR5447571	SRX2736536	SRP103908	SAMN06754575	11,690,019	84%	9%
SRR5447570	SRX2736535	SRP103908	SAMN06754576	12,334,876	86%	11%
SRR5447569	SRX2736534	SRP103908	SAMN06754577	8,458,263	80%	10%
SRR5447568	SRX2736533	SRP103908	SAMN06754578	12,999,895	85%	10%
SRR5447567	SRX2736532	SRP103908	SAMN06754579	11,480,488	84%	10%
SRR5447566	SRX2736531	SRP103908	SAMN06754580	13,442,376	85%	11%
SRR5447565	SRX2736530	SRP103908	SAMN06754581	13,376,240	83%	10%
SRR5447564	SRX2736529	SRP103908	SAMN06754582	12,347,155	85%	10%
SRR5447563	SRX2736528	SRP103908	SAMN06754583	10,990,923	82%	9%
SRR5447562	SRX2736527	SRP103908	SAMN06754584	11,353,287	82%	9%
SRR5447561	SRX2736526	SRP103908	SAMN06754585	11,323,569	80%	9%
SRR5447560	SRX2736525	SRP103908	SAMN06754586	11,488,879	86%	9%
SRR5447559	SRX2736524	SRP103908	SAMN06754587	11,673,361	83%	9%
SRR5447558	SRX2736523	SRP103908	SAMN06754588	9,251,618	82%	9%
SRR5447557	SRX2736522	SRP103908	SAMN06754589	12,491,335	83%	8%
SRR5447556	SRX2736521	SRP103908	SAMN06754590	13,020,732	85%	11%
SRR5447555	SRX2736520	SRP103908	SAMN06754591	10,037,557	86%	11%
SRR5447554	SRX2736519	SRP103908	SAMN06754592	13,239,550	86%	9%
SRR5447553	SRX2736518	SRP103908	SAMN06754593	12,507,564	84%	10%
SRR5447552	SRX2736517	SRP103908	SAMN06754594	11,150,878	85%	11%
SRR5447551	SRX2736516	SRP103908	SAMN06754595	14,211,208	85%	11%
SRR5447550	SRX2736515	SRP103908	SAMN06754596	11,136,867	80%	10%
SRR5447549	SRX2736514	SRP103908	SAMN06754597	10,671,834	85%	11%
SRR5447548	SRX2736513	SRP103908	SAMN06754598	11,293,627	85%	11%
SRR5447547	SRX2736512	SRP103908	SAMN06754599	12,808,038	84%	11%
SRR5447546	SRX2736511	SRP103908	SAMN06754600	12,015,848	84%	10%
SRR5447545	SRX2736510	SRP103908	SAMN06754601	12,557,854	84%	11%
SRR5447544	SRX2736509	SRP103908	SAMN06754602	10,887,237	84%	11%
SRR5447543	SRX2736508	SRP103908	SAMN06754603	10,628,776	84%	11%
SRR5447542	SRX2736507	SRP103908	SAMN06754604	11,840,435	85%	11%
SRR5447541	SRX2736506	SRP103908	SAMN06754605	11,518,260	85%	11%
SRR5447540	SRX2736505	SRP103908	SAMN06754606	11,937,485	87%	7%
SRR5447539	SRX2736504	SRP103908	SAMN06754607	13,024,420	86%	11%
SRR5447538	SRX2736503	SRP103908	SAMN06754608	12,514,841	82%	9%
SRR5447537	SRX2736502	SRP103908	SAMN06754609	11,690,014	87%	9%
SRR5447536	SRX2736501	SRP103908	SAMN06754610	11,694,203	85%	11%
SRR5447535	SRX2736500	SRP103908	SAMN06754611	13,047,022	84%	9%
SRR5447534	SRX2736499	SRP103908	SAMN06754612	13,084,671	84%	11%
SRR5447533	SRX2736498	SRP103908	SAMN06754613	11,380,539	84%	11%
SRR5447532	SRX2736497	SRP103908	SAMN06754614	12,470,523	85%	11%
SRR5447531	SRX2736496	SRP103908	SAMN06754615	11,824,360	85%	11%
SRR5447530	SRX2736495	SRP103908	SAMN06754616	13,063,164	85%	10%
SRR5447529	SRX2736494	SRP103908	SAMN06754617	11,019,143	78%	11%
SRR5447528	SRX2736493	SRP103908	SAMN06754618	11,448,280	84%	11%
SRR5447527	SRX2736492	SRP103908	SAMN06754619	13,081,351	84%	11%
SRR5447526	SRX2736491	SRP103908	SAMN06754620	10,837,371	82%	10%
SRR5447525	SRX2736490	SRP103908	SAMN06754621	10,337,922	84%	10%
SRR5447524	SRX2736489	SRP103908	SAMN06754622	10,418,966	84%	10%
SRR5447523	SRX2736488	SRP103908	SAMN06754623	12,000,842	85%	11%
SRR5447522	SRX2736487	SRP103908	SAMN06754624	12,824,388	83%	9%
SRR5447521	SRX2736486	SRP103908	SAMN06754625	12,063,844	83%	10%
SRR5447520	SRX2736485	SRP103908	SAMN06754626	9,336,794	81%	10%
SRR5447519	SRX2736484	SRP103908	SAMN06754627	14,603,714	96%	4%
SRR5447518	SRX2736483	SRP103908	SAMN06754628	11,699,223	84%	10%
SRR5447517	SRX2736482	SRP103908	SAMN06754629	12,997,454	83%	11%
SRR5447516	SRX2736481	SRP103908	SAMN06754630	12,187,993	83%	10%
SRR5447515	SRX2736480	SRP103908	SAMN06754631	13,943,712	81%	10%
SRR5447514	SRX2736479	SRP103908	SAMN06754632	10,135,954	80%	9%
SRR5447513	SRX2736478	SRP103908	SAMN06754633	11,732,624	81%	10%
SRR5447512	SRX2736477	SRP103908	SAMN06754634	12,317,910	83%	10%
SRR5447511	SRX2736476	SRP103908	SAMN06754635	9,799,978	77%	10%
SRR5447510	SRX2736475	SRP103908	SAMN06754636	11,079,299	83%	10%
SRR5447509	SRX2736474	SRP103908	SAMN06754637	11,997,388	83%	10%
SRR5447508	SRX2736473	SRP103908	SAMN06754638	10,797,073	82%	10%
SRR5447507	SRX2736472	SRP103908	SAMN06754639	9,707,797	82%	10%
SRR5447506	SRX2736471	SRP103908	SAMN06754640	8,756,378	82%	10%
SRR5447505	SRX2736470	SRP103908	SAMN06754641	12,678,001	81%	10%
SRR5447504	SRX2736469	SRP103908	SAMN06754642	11,097,187	82%	10%
SRR5447503	SRX2736468	SRP103908	SAMN06754643	10,820,574	84%	10%
SRR5447502	SRX2736467	SRP103908	SAMN06754644	12,831,204	81%	10%
SRR5447501	SRX2736466	SRP103908	SAMN06754645	12,677,197	82%	10%
SRR5447500	SRX2736465	SRP103908	SAMN06754646	11,267,566	81%	10%
SRR5447499	SRX2736464	SRP103908	SAMN06754647	10,838,186	80%	10%
SRR5447498	SRX2736463	SRP103908	SAMN06754648	9,115,331	77%	10%
SRR5447497	SRX2736462	SRP103908	SAMN06754649	11,737,636	80%	10%
SRR5447496	SRX2736461	SRP103908	SAMN06754650	11,473,085	81%	10%
SRR5447495	SRX2736460	SRP103908	SAMN06754651	11,602,581	83%	10%
SRR5447494	SRX2736459	SRP103908	SAMN06754652	12,915,030	83%	11%
SRR5447493	SRX2736458	SRP103908	SAMN06754653	10,775,140	82%	10%
SRR5447492	SRX2736457	SRP103908	SAMN06754654	10,128,368	84%	10%
SRR5683164	SRX2881301	SRP108106	SAMN07167856	57,836,792	83%	22%
SRR6818644	SRX3775611	SRP134094	SAMN08648809	51,284,958	78%	18%
SRR6818627	SRX3775628	SRP134094	SAMN08648809	22,615,848	81%	7%
SRR6818626	SRX3775629	SRP134094	SAMN08648809	26,189,771	73%	7%
SRR6818625	SRX3775630	SRP134094	SAMN08648809	17,549,302	79%	7%
SRR6818624	SRX3775631	SRP134094	SAMN08648809	19,993,693	73%	7%
SRR6818660	SRX3775595	SRP134094	SAMN08648810	17,132,971	81%	9%
SRR6818648	SRX3775607	SRP134094	SAMN08648810	15,970,738	82%	9%
SRR6818647	SRX3775608	SRP134094	SAMN08648810	17,841,901	83%	9%
SRR6818646	SRX3775609	SRP134094	SAMN08648810	32,846,908	83%	9%
SRR6818645	SRX3775610	SRP134094	SAMN08648810	27,405,885	81%	9%
SRR6818643	SRX3775612	SRP134094	SAMN08648810	19,307,208	82%	9%
SRR6818641	SRX3775614	SRP134094	SAMN08648810	41,731,336	84%	24%
SRR6818632	SRX3775623	SRP134094	SAMN08648810	23,015,788	83%	9%
SRR6818630	SRX3775625	SRP134094	SAMN08648810	20,259,852	80%	9%
SRR6818621	SRX3775634	SRP134094	SAMN08648810	22,259,053	79%	9%
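
The by-sample and by-run tables describe the same reads at two levels of grouping: a sample with several runs has a read count equal to the sum over its runs. As a minimal Python sketch, the snippet below aggregates the five runs of sample SAMN08648809 copied from the table above. The read-weighted percent aligned is only an approximation, because the per-run percentages in the table are already rounded; how the pipeline itself computes the per-sample figure is not stated in this report.

```python
from collections import defaultdict

# (run, sample, number of reads, percent aligned) copied from the by-run table
runs = [
    ("SRR6818644", "SAMN08648809", 51_284_958, 78),
    ("SRR6818627", "SAMN08648809", 22_615_848, 81),
    ("SRR6818626", "SAMN08648809", 26_189_771, 73),
    ("SRR6818625", "SAMN08648809", 17_549_302, 79),
    ("SRR6818624", "SAMN08648809", 19_993_693, 73),
]

reads_per_sample = defaultdict(int)
aligned_per_sample = defaultdict(float)
for run, sample, reads, pct_aligned in runs:
    reads_per_sample[sample] += reads
    # Approximate aligned reads from the rounded per-run percentage.
    aligned_per_sample[sample] += reads * pct_aligned / 100.0

for sample, total in reads_per_sample.items():
    weighted_pct = 100.0 * aligned_per_sample[sample] / total
    print(f"{sample}: {total:,} reads, about {weighted_pct:.0f}% aligned")
```

The summed read count matches the 137,633,572 reads listed for SAMN08648809 in the by-sample table.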

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Pararge aegeria high-quality model RefSeq (XP_)	10,624	9,411 (88.58%)	9,411 (88.58%)	73.24%	83.52%
Insecta GenBank	114,166	71,767 (62.86%)	71,767 (62.86%)	69.01%	71.73%
Insecta known RefSeq (NP_)	39,088	16,679 (42.67%)	16,679 (42.67%)	64.08%	56.03%
Same-species GenBank	297	274 (92.26%)	274 (92.26%)	88.41%	94.21%
Tribolium castaneum high-quality model RefSeq (XP_)	11,487	8,401 (73.13%)	8,401 (73.13%)	62.34%	57.56%
Bombyx mori high-quality model RefSeq (XP_)	10,925	9,666 (88.48%)	9,666 (88.48%)	69.74%	78.04%
Apis mellifera high-quality model RefSeq (XP_)	8,880	6,687 (75.30%)	6,687 (75.30%)	63.54%	60.15%
Papilio polytes high-quality model RefSeq (XP_)	8,718	8,210 (94.17%)	8,210 (94.17%)	70.53%	80.32%

Assembly-assembly alignments of current to previous assembly

When the assembly changes between two rounds of annotation, genes in the current and the previous annotation are mapped to each other using the genomic alignments of the current assembly to the previous assembly so that gene identifiers can be preserved. The success of the remapping depends largely on how well the two assembly versions align to each other.

Below are the percent coverage of one assembly by the other and the average percent identity of the alignments. The 'First pass' alignments are reciprocal best hits, while the 'Total' alignments also include 'Second pass' or non-reciprocal best alignments. For more information about the assembly-assembly alignment process, please visit the NCBI Genome Remapping Service page.

First Pass	Total
ilPieRapa1.1 (Current) Coverage: 86.67%	ilPieRapa1.1 (Current) Coverage: 88.42%
P_rapae_3842_assembly_v2 (Previous) Coverage: 91.54%	P_rapae_3842_assembly_v2 (Previous) Coverage: 92.08%
Percent Identity: 95.29%	Percent Identity: 95.28%
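
The distinction between 'First pass' (reciprocal best hits) and non-reciprocal alignments can be illustrated with a generic Python sketch. This is not the NCBI remapping implementation; the region names, the scores and the function names are hypothetical, and a single numeric score per alignment is assumed.

```python
def best_hits(scores: dict) -> dict:
    """Map each query to its highest-scoring target.

    `scores` maps (query, target) pairs to an alignment score.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > scores[(query, best[query])]:
            best[query] = target
    return best


def reciprocal_best_hits(scores: dict) -> list:
    """Pairs where each member is the other's best hit."""
    forward = best_hits(scores)
    reverse = best_hits({(t, q): s for (q, t), s in scores.items()})
    return sorted((q, t) for q, t in forward.items() if reverse.get(t) == q)


# Hypothetical scores between regions of the current (cur_*) and
# previous (prev_*) assembly.
scores = {
    ("cur_A", "prev_A"): 950,
    ("cur_A", "prev_B"): 400,
    ("cur_B", "prev_B"): 900,
    ("cur_C", "prev_B"): 700,
}

pairs = reciprocal_best_hits(scores)
print(pairs)
```

In this toy example, cur_C has prev_B as its best hit, but prev_B's best hit is cur_B, so the cur_C alignment is non-reciprocal and would only count toward a 'Total'-style figure.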

Comparison of the current and previous annotations

The annotation produced for this release (101) was compared to the annotation in the previous release (100) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.
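
The report does not give the scoring formulas. Purely as an illustration of what "overlap in exon sequence" and "matches in exon boundaries" could mean, the Python sketch below scores a pair of features with a Jaccard index over exon bases and over exon boundary positions. The use of the Jaccard index, the function names and the coordinates are all assumptions, not the pipeline's method.

```python
def exon_bases(exons) -> set:
    """Set of genomic positions covered by exons given as (start, end), inclusive."""
    bases = set()
    for start, end in exons:
        bases.update(range(start, end + 1))
    return bases


def overlap_score(current, previous) -> float:
    """Jaccard index of exon bases (assumed measure of exon sequence overlap)."""
    a, b = exon_bases(current), exon_bases(previous)
    return len(a & b) / len(a | b)


def boundary_score(current, previous) -> float:
    """Jaccard index of exon start/end positions (assumed boundary-match measure)."""
    a = {pos for exon in current for pos in exon}
    b = {pos for exon in previous for pos in exon}
    return len(a & b) / len(a | b)


# Hypothetical two-exon feature whose second exon was longer in the previous release.
current = [(100, 199), (300, 399)]
previous = [(100, 199), (300, 449)]

print(overlap_score(current, previous), boundary_score(current, previous))
```

Building position sets is convenient for a small example but would be inefficient for whole-genome comparisons, where interval arithmetic would be the more natural choice.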

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	ilPieRapa1.1 (Current) to P_rapae_3842_assembly_v2 (Previous)
Identical	5%
Minor changes	72%
Major changes	11%
New	11%
Deprecated	10%
Other	<1%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
Minimap2: Li H. Bioinformatics 2018 Sep 15;34(18):3094-3100
