NCBI Lytechinus pictus Annotation Release GCF_015342785.2-RS_2023_03

The genome sequence records for Lytechinus pictus RefSeq assembly GCF_015342785.2 (UCSD_Lpic_2.1) were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
BUSCO results: Annotation completeness assessed with BUSCO
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as "GCF_015342785.2-RS_2023_03".

Date of Entrez queries for transcripts and proteins: Mar 31 2023
Date of submission of annotation to the public databases: Apr 4 2023
Software version: 10.1

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
UCSD_Lpic_2.1	GCF_015342785.2	University of North Carolina Wilmington	03-08-2023	Reference	19 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and length of annotated features are provided below for each assembly.

Feature counts

Feature	UCSD_Lpic_2.1
Genes and pseudogenes	30,773
protein-coding	24,732
non-coding	4,353
Transcribed pseudogenes	0
Non-transcribed pseudogenes	1,688
genes with variants	2,196
Immunoglobulin/T-cell receptor gene segments	0
other	0
mRNAs	27,757
fully-supported	18,235
with > 5% ab initio	5,026
partial	653
with filled gap(s)	0
known RefSeq (NM_)	0
model RefSeq (XM_)	27,757
non-coding RNAs	4,651
fully-supported	2,799
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	3,120
pseudo transcripts	0
fully-supported	0
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	0
CDSs	27,757
fully-supported	18,235
with > 5% ab initio	5,635
partial	653
with major correction(s)	1,151
known RefSeq (NP_)	0
model RefSeq (XP_)	27,757
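
As a quick consistency check on the table above, the gene categories can be added up and compared with the reported "Genes and pseudogenes" total. The following is a minimal sketch with counts copied from the table. The variable names are illustrative, and it assumes that the listed categories are mutually exclusive and together make up the total.

```python
# Counts copied from the "Feature counts" table for UCSD_Lpic_2.1.
# Assumption: these categories do not overlap and jointly form the total.
gene_counts = {
    "protein-coding": 24_732,
    "non-coding": 4_353,
    "transcribed pseudogenes": 0,
    "non-transcribed pseudogenes": 1_688,
    "Ig/TCR gene segments": 0,
    "other": 0,
}
reported_total = 30_773  # "Genes and pseudogenes" row

computed_total = sum(gene_counts.values())
print(f"computed={computed_total:,} reported={reported_total:,} "
      f"match={computed_total == reported_total}")

# Genes excluding pseudogenes, for comparison with the "Feature lengths" table.
genes_without_pseudogenes = gene_counts["protein-coding"] + gene_counts["non-coding"]
print(f"genes without pseudogenes: {genes_without_pseudogenes:,}")
```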

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	29,085	12,145	8,266	70	160,609
All transcripts	32,408	2,152	1,536	70	35,055
mRNA	27,757	2,402	1,743	174	35,055
misc_RNA	156	3,152	2,451	159	11,445
tRNA	1,531	75	73	70	96
lncRNA	2,643	905	494	84	10,794
snoRNA	98	119	100	71	245
snRNA	187	145	141	99	199
rRNA	36	867	119	118	3,797
Single-exon transcripts	2,578	1,582	1,272	258	17,755
coding transcripts (NM_/XM_)	2,578	1,582	1,272	258	17,755
CDSs	27,757	1,640	1,194	174	34,548
Exons	205,777	287	145	1	17,755
in coding transcripts (NM_/XM_)	198,404	284	145	1	17,755
in non-coding transcripts (NR_/XR_)	8,277	330	134	10	10,470
Introns	178,145	1,712	927	30	136,174
in coding transcripts (NM_/XM_)	173,455	1,711	927	30	136,174
in non-coding transcripts (NR_/XR_)	5,579	1,715	921	30	9,991
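
The per-class transcript counts in the table above can be reconciled with the "All transcripts" row and with the "non-coding RNAs" row of the Feature counts table. This is a small sketch with values copied from the report. The dictionary keys and variable names are illustrative only.

```python
# Transcript counts per class, copied from the "Feature lengths" table.
transcript_counts = {
    "mRNA": 27_757,
    "misc_RNA": 156,
    "tRNA": 1_531,
    "lncRNA": 2_643,
    "snoRNA": 98,
    "snRNA": 187,
    "rRNA": 36,
}
reported_all_transcripts = 32_408  # "All transcripts" row
reported_noncoding_rnas = 4_651    # "non-coding RNAs" row in Feature counts

all_transcripts = sum(transcript_counts.values())
noncoding_rnas = all_transcripts - transcript_counts["mRNA"]

print(all_transcripts == reported_all_transcripts)
print(noncoding_rnas == reported_noncoding_rnas)
```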

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.12	1	1	21
Number of exons per transcript	7.99	5	1	182

BUSCO analysis of gene annotation

BUSCO v4.1.4 was run in "protein" mode with the metazoa_odb10 lineage dataset on the annotated gene set, using the single longest protein per gene. Results are reported for the gene set from the primary assembly unit and are presented in BUSCO notation.

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 24,732 coding genes, 17,675 genes had a protein with an alignment covering 50% or more of the query and 3,775 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
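
The two coverage definitions above amount to a simple ratio, sketched below. The function name and the example alignment (300 aligned residues, a 400-residue query, a 500-residue target) are hypothetical and only illustrate the arithmetic. The gene counts are the ones reported in this section.

```python
def coverage_percent(aligned_length: int, sequence_length: int) -> float:
    """Percentage of a sequence's length that is included in an alignment."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * aligned_length / sequence_length

# Hypothetical alignment, for illustration only.
query_coverage = coverage_percent(aligned_length=300, sequence_length=400)
target_coverage = coverage_percent(aligned_length=300, sequence_length=500)
print(f"query coverage:  {query_coverage:.1f}%")
print(f"target coverage: {target_coverage:.1f}%")

# Share of coding genes above each query-coverage threshold (counts from the report).
coding_genes = 24_732
genes_cov50 = 17_675  # alignment covering >= 50% of the query
genes_cov95 = 3_775   # alignment covering >= 95% of the query
share_cov50 = 100.0 * genes_cov50 / coding_genes
share_cov95 = 100.0 * genes_cov95 / coding_genes
print(f">=50% query coverage: {share_cov50:.2f}% of coding genes")
print(f">=95% query coverage: {share_cov95:.2f}% of coding genes")
```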

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker (if calculated), for each assembly. RepeatMasker results are only calculated for organisms with complete Dfam HMM model collections.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with WindowMasker
UCSD_Lpic_2.1	GCF_015342785.2	42.68%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez Nucleotide, Entrez Protein, and SRA, and aligned to the genome.

Transcript alignments

The alignments of the following transcripts with Splign were used for gene prediction:

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	17	16 (94.12%)	15 (88.24%)	97.83%	89.61%

RNA-Seq alignments

The alignments of the following RNA-Seq reads with STAR were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	Aggregate of all aligned samples	8,079,228,936	28%	23%	192,699
SAMN03160518	whole embryo (Lytechinus variegatus, 1hr, SAMN03160518)	135,391,860	49%	23%	148,738
SAMN03160519	whole embryo (Lytechinus variegatus, 2.5 hr, SAMN03160519)	109,530,556	54%	25%	147,906
SAMN03160520	whole embryo (Lytechinus variegatus, 4 hr, SAMN03160520)	138,294,514	31%	24%	141,227
SAMN03160521	whole embryo (Lytechinus variegatus, 7 hr, SAMN03160521)	222,708,356	44%	21%	153,583
SAMN03160522	whole embryo (Lytechinus variegatus, 10 hr, SAMN03160522)	205,656,504	47%	22%	155,934
SAMN03160523	whole embryo (Lytechinus variegatus, 12 hr, SAMN03160523)	182,947,818	39%	24%	151,238
SAMN03160524	whole embryo (Lytechinus variegatus, 13 hr, SAMN03160524)	34,690,108	30%	25%	121,348
SAMN03160525	whole embryo (Lytechinus variegatus, 15 hr, SAMN03160525)	33,787,794	51%	29%	136,567
SAMN03160526	whole embryo (Lytechinus variegatus, 18 hr, SAMN03160526)	34,415,592	34%	26%	130,182
SAMN03160527	whole larva (Lytechinus variegatus, 36 hr, SAMN03160527)	194,454,022	49%	27%	164,984
SAMN03160528	whole larva (Lytechinus variegatus, 48 hr, SAMN03160528)	177,754,036	46%	26%	160,701
SAMN03160529	whole embryo (Lytechinus variegatus, 13 hr, SAMN03160529)	25,678,186	25%	26%	112,875
SAMN03160530	whole embryo (Lytechinus variegatus, 18 hr, SAMN03160530)	33,821,818	41%	27%	132,446
SAMN03160531	whole embryo (Lytechinus variegatus, 15 hr, SAMN03160531)	28,841,880	52%	29%	134,138
SAMN03160532	whole embryo (Lytechinus variegatus, 13 hr, SAMN03160532)	30,090,180	29%	25%	119,929
SAMN03160533	whole embryo (Lytechinus variegatus, 15 hr, SAMN03160533)	28,644,016	56%	30%	135,798
SAMN03160534	whole embryo (Lytechinus variegatus, 18 hr, SAMN03160534)	34,244,902	37%	26%	132,974
SAMN12257707	Whole embryo (Lytechinus variegatus, SAMN12257707)	27,106,210	45%	21%	120,382
SAMN12257708	Whole embryo (Lytechinus variegatus, SAMN12257708)	26,699,180	45%	21%	118,269
SAMN12257709	Whole embryo (Lytechinus variegatus, SAMN12257709)	22,832,786	44%	22%	118,525
SAMN12257710	Whole embryo (Lytechinus variegatus, SAMN12257710)	28,616,686	46%	22%	120,720
SAMN12257711	Whole embryo (Lytechinus variegatus, SAMN12257711)	22,636,576	45%	22%	118,747
SAMN12257712	Whole embryo (Lytechinus variegatus, SAMN12257712)	23,274,068	45%	21%	115,765
SAMN12257713	Whole embryo (Lytechinus variegatus, SAMN12257713)	21,854,592	46%	21%	116,639
SAMN12257714	Whole embryo (Lytechinus variegatus, SAMN12257714)	26,057,630	45%	19%	117,262
SAMN12257715	Whole embryo (Lytechinus variegatus, SAMN12257715)	29,674,354	48%	21%	126,560
SAMN12257716	Whole embryo (Lytechinus variegatus, SAMN12257716)	28,834,782	46%	20%	124,108
SAMN12257717	Whole embryo (Lytechinus variegatus, SAMN12257717)	29,207,382	47%	21%	127,473
SAMN12257718	Whole embryo (Lytechinus variegatus, SAMN12257718)	29,440,688	47%	20%	126,495
SAMN12257719	Whole embryo (Lytechinus variegatus, SAMN12257719)	26,453,198	48%	21%	128,604
SAMN12257720	Whole embryo (Lytechinus variegatus, SAMN12257720)	30,365,230	48%	21%	130,449
SAMN12257721	Whole embryo (Lytechinus variegatus, SAMN12257721)	25,355,186	49%	22%	130,428
SAMN12257722	Whole embryo (Lytechinus variegatus, SAMN12257722)	23,066,278	48%	21%	127,583
SAMN12257723	Whole embryo (Lytechinus variegatus, SAMN12257723)	27,058,234	49%	22%	133,275
SAMN12257724	Whole embryo (Lytechinus variegatus, SAMN12257724)	33,744,276	48%	21%	135,796
SAMN12257725	Whole embryo (Lytechinus variegatus, SAMN12257725)	24,370,236	49%	23%	136,982
SAMN12257726	Whole embryo (Lytechinus variegatus, SAMN12257726)	23,377,224	49%	23%	136,929
SAMN12257727	Whole embryo (Lytechinus variegatus, SAMN12257727)	29,370,822	48%	22%	137,035
SAMN12257728	Whole embryo (Lytechinus variegatus, SAMN12257728)	23,774,104	48%	21%	133,256
SAMN12257729	Whole embryo (Lytechinus variegatus, SAMN12257729)	122,341,206	16%	22%	137,182
SAMN12257730	Whole embryo (Lytechinus variegatus, SAMN12257730)	115,734,548	41%	21%	156,788
SAMN12257731	Whole embryo (Lytechinus variegatus, SAMN12257731)	124,860,252	12%	21%	133,242
SAMN12257732	Whole embryo (Lytechinus variegatus, SAMN12257732)	122,924,104	41%	21%	157,969
SAMN12257733	Partial portion of embryo (Lytechinus variegatus, SAMN12257733)	96,837,982	45%	21%	151,762
SAMN12257734	Partial portion of embryo (Lytechinus variegatus, SAMN12257734)	79,019,422	43%	20%	150,055
SAMN12257735	Partial portion of embryo (Lytechinus variegatus, SAMN12257735)	95,407,356	45%	22%	151,942
SAMN12257736	Partial portion of embryo (Lytechinus variegatus, SAMN12257736)	99,320,740	44%	21%	154,018
SAMN12257737	Whole embryo (Lytechinus variegatus, SAMN12257737)	97,306,414	40%	20%	151,709
SAMN12257738	Whole embryo (Lytechinus variegatus, SAMN12257738)	118,612,592	39%	20%	154,444
SAMN12257739	Whole embryo (Lytechinus variegatus, SAMN12257739)	99,561,804	41%	20%	150,024
SAMN12257740	Whole embryo (Lytechinus variegatus, SAMN12257740)	118,471,080	27%	21%	144,218
SAMN12257741	Whole embryo (Lytechinus variegatus, SAMN12257741)	65,041,738	43%	22%	125,397
SAMN12257742	Whole embryo (Lytechinus variegatus, SAMN12257742)	65,848,060	44%	21%	122,111
SAMN29402202	whole embryo (Lytechinus variegatus, SAMN29402202)	48,143,242	49%	24%	131,656
SAMN29402203	whole embryo (Lytechinus variegatus, SAMN29402203)	47,984,520	47%	23%	137,635
SAMN29402204	whole embryo (Lytechinus variegatus, SAMN29402204)	48,289,888	47%	22%	143,039
SAMN29402205	whole embryo (Lytechinus variegatus, SAMN29402205)	48,284,990	49%	24%	131,891
SAMN29402206	whole embryo (Lytechinus variegatus, SAMN29402206)	48,224,048	43%	21%	138,122
SAMN29402207	whole embryo (Lytechinus variegatus, SAMN29402207)	48,036,470	46%	24%	127,489
SAMN29402208	whole embryo (Lytechinus variegatus, SAMN29402208)	48,004,166	48%	24%	135,558
SAMN29402209	whole embryo (Lytechinus variegatus, SAMN29402209)	47,711,542	49%	25%	129,572
SAMN29402210	whole embryo (Lytechinus variegatus, SAMN29402210)	47,919,442	46%	24%	134,816
SAMN29402211	whole embryo (Lytechinus variegatus, SAMN29402211)	47,856,474	48%	25%	120,078
SAMN29402212	whole embryo (Lytechinus variegatus, SAMN29402212)	48,106,972	49%	24%	126,696
SAMN29402213	whole embryo (Lytechinus variegatus, SAMN29402213)	47,835,782	49%	25%	112,859
SAMN29402214	whole embryo (Lytechinus variegatus, SAMN29402214)	48,219,462	45%	23%	137,423
SAMN29402215	whole embryo (Lytechinus variegatus, SAMN29402215)	48,368,838	47%	24%	136,533
SAMN29402216	whole embryo (Lytechinus variegatus, SAMN29402216)	48,603,200	44%	21%	138,732
SAMN29402217	whole embryo (Lytechinus variegatus, SAMN29402217)	48,635,432	43%	23%	131,987
SAMN29402218	whole embryo (Lytechinus variegatus, SAMN29402218)	48,692,340	41%	20%	140,396
SAMN29402219	whole embryo (Lytechinus variegatus, SAMN29402219)	48,642,778	44%	22%	134,825
SAMN29855636	8 hpf (Lytechinus variegatus, SAMN29855636)	836,712,930	6%	22%	115,085
SAMN29855637	6 hpf (Lytechinus variegatus, SAMN29855637)	809,207,912	6%	25%	117,516
SAMN29855639	2 hpf (Lytechinus variegatus, SAMN29855639)	812,336,060	7%	32%	125,759
SAMN31077260	embryonic (Lytechinus pictus, SAMN31077260)	259,713,662	67%	17%	167,303
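
For downstream use, rows of the per-sample table above can be parsed into typed records, for example to separate the same-species sample from the Lytechinus variegatus samples. This is a rough sketch that assumes tab-separated columns in the order shown in the table header. The function name, field names and regular expression are illustrative, and only three rows are included here for brevity.

```python
import re

# Three rows copied from the per-sample table (tab-separated).
ROWS = """\
SAMN03160518\twhole embryo (Lytechinus variegatus, 1hr, SAMN03160518)\t135,391,860\t49%\t23%\t148,738
SAMN29855636\t8 hpf (Lytechinus variegatus, SAMN29855636)\t836,712,930\t6%\t22%\t115,085
SAMN31077260\tembryonic (Lytechinus pictus, SAMN31077260)\t259,713,662\t67%\t17%\t167,303
"""

def parse_row(line: str) -> dict:
    """Split one table row and convert the numeric columns."""
    sample, track, reads, pct_aligned, pct_introns, introns = line.split("\t")
    # The species name is the first item inside the parentheses of the track name.
    match = re.search(r"\(([A-Z][a-z]+ [a-z]+),", track)
    return {
        "sample": sample,
        "species": match.group(1) if match else None,
        "reads": int(reads.replace(",", "")),
        "pct_aligned": int(pct_aligned.rstrip("%")),
        "pct_with_introns": int(pct_introns.rstrip("%")),
        "introns": int(introns.replace(",", "")),
    }

records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

for rec in records:
    # The reported percentage is rounded, so this is only an approximation.
    approx_aligned = round(rec["reads"] * rec["pct_aligned"] / 100)
    print(rec["sample"], rec["species"], f"~{approx_aligned:,} aligned reads")
```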

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
SRR1661401	SRX766497	SRP050165	SAMN03160518	135,391,860	49%	23%
SRR1661399	SRX766496	SRP050165	SAMN03160519	109,530,556	54%	25%
SRR1661397	SRX766495	SRP050165	SAMN03160520	138,294,514	31%	24%
SRR1661409	SRX766494	SRP050165	SAMN03160521	222,708,356	44%	21%
SRR1661406	SRX766493	SRP050165	SAMN03160522	205,656,504	47%	22%
SRR1661395	SRX766174	SRP050165	SAMN03160523	182,947,818	39%	24%
SRR1661363	SRX766173	SRP050165	SAMN03160524	34,690,108	30%	25%
SRR1661113	SRX766172	SRP050165	SAMN03160525	33,787,794	51%	29%
SRR1661112	SRX766171	SRP050165	SAMN03160526	34,415,592	34%	26%
SRR1661111	SRX766170	SRP050165	SAMN03160527	194,454,022	49%	27%
SRR1661090	SRX766169	SRP050165	SAMN03160528	177,754,036	46%	26%
SRR1661081	SRX766167	SRP050165	SAMN03160529	25,678,186	25%	26%
SRR1661079	SRX766165	SRP050165	SAMN03160530	33,821,818	41%	27%
SRR1661077	SRX766163	SRP050165	SAMN03160531	28,841,880	52%	29%
SRR1661075	SRX766161	SRP050165	SAMN03160532	30,090,180	29%	25%
SRR1660833	SRX766159	SRP050165	SAMN03160533	28,644,016	56%	30%
SRR1660831	SRX766157	SRP050165	SAMN03160534	34,244,902	37%	26%
SRR9673388	SRX6433849	SRP214403	SAMN12257707	27,106,210	45%	21%
SRR9673387	SRX6433850	SRP214403	SAMN12257708	26,699,180	45%	21%
SRR9673386	SRX6433851	SRP214403	SAMN12257709	22,832,786	44%	22%
SRR9673385	SRX6433852	SRP214403	SAMN12257710	28,616,686	46%	22%
SRR9673384	SRX6433853	SRP214403	SAMN12257711	22,636,576	45%	22%
SRR9673383	SRX6433854	SRP214403	SAMN12257712	23,274,068	45%	21%
SRR9673382	SRX6433855	SRP214403	SAMN12257713	21,854,592	46%	21%
SRR9673381	SRX6433856	SRP214403	SAMN12257714	26,057,630	45%	19%
SRR9673390	SRX6433847	SRP214403	SAMN12257715	29,674,354	48%	21%
SRR9673389	SRX6433848	SRP214403	SAMN12257716	28,834,782	46%	20%
SRR9673408	SRX6433829	SRP214403	SAMN12257717	29,207,382	47%	21%
SRR9673407	SRX6433830	SRP214403	SAMN12257718	29,440,688	47%	20%
SRR9673410	SRX6433827	SRP214403	SAMN12257719	26,453,198	48%	21%
SRR9673409	SRX6433828	SRP214403	SAMN12257720	30,365,230	48%	21%
SRR9673404	SRX6433833	SRP214403	SAMN12257721	25,355,186	49%	22%
SRR9673403	SRX6433834	SRP214403	SAMN12257722	23,066,278	48%	21%
SRR9673406	SRX6433831	SRP214403	SAMN12257723	27,058,234	49%	22%
SRR9673405	SRX6433832	SRP214403	SAMN12257724	33,744,276	48%	21%
SRR9673402	SRX6433835	SRP214403	SAMN12257725	24,370,236	49%	23%
SRR9673401	SRX6433836	SRP214403	SAMN12257726	23,377,224	49%	23%
SRR9673391	SRX6433846	SRP214403	SAMN12257727	29,370,822	48%	22%
SRR9673392	SRX6433845	SRP214403	SAMN12257728	23,774,104	48%	21%
SRR9673393	SRX6433844	SRP214403	SAMN12257729	122,341,206	16%	22%
SRR9673394	SRX6433843	SRP214403	SAMN12257730	115,734,548	41%	21%
SRR9673395	SRX6433842	SRP214403	SAMN12257731	124,860,252	12%	21%
SRR9673396	SRX6433841	SRP214403	SAMN12257732	122,924,104	41%	21%
SRR9673397	SRX6433840	SRP214403	SAMN12257733	96,837,982	45%	21%
SRR9673398	SRX6433839	SRP214403	SAMN12257734	79,019,422	43%	20%
SRR9673399	SRX6433838	SRP214403	SAMN12257735	95,407,356	45%	22%
SRR9673400	SRX6433837	SRP214403	SAMN12257736	99,320,740	44%	21%
SRR9673380	SRX6433857	SRP214403	SAMN12257737	97,306,414	40%	20%
SRR9673379	SRX6433858	SRP214403	SAMN12257738	118,612,592	39%	20%
SRR9673378	SRX6433859	SRP214403	SAMN12257739	99,561,804	41%	20%
SRR9673377	SRX6433860	SRP214403	SAMN12257740	118,471,080	27%	21%
SRR9673376	SRX6433861	SRP214403	SAMN12257741	65,041,738	43%	22%
SRR9673375	SRX6433862	SRP214403	SAMN12257742	65,848,060	44%	21%
SRR19884298	SRX15927644	SRP383983	SAMN29402202	48,143,242	49%	24%
SRR19884299	SRX15927643	SRP383983	SAMN29402203	47,984,520	47%	23%
SRR19884300	SRX15927642	SRP383983	SAMN29402204	48,289,888	47%	22%
SRR19884301	SRX15927641	SRP383983	SAMN29402205	48,284,990	49%	24%
SRR19884302	SRX15927640	SRP383983	SAMN29402206	48,224,048	43%	21%
SRR19884303	SRX15927639	SRP383983	SAMN29402207	48,036,470	46%	24%
SRR19884304	SRX15927638	SRP383983	SAMN29402208	48,004,166	48%	24%
SRR19884305	SRX15927637	SRP383983	SAMN29402209	47,711,542	49%	25%
SRR19884306	SRX15927636	SRP383983	SAMN29402210	47,919,442	46%	24%
SRR19884307	SRX15927635	SRP383983	SAMN29402211	47,856,474	48%	25%
SRR19884308	SRX15927634	SRP383983	SAMN29402212	48,106,972	49%	24%
SRR19884309	SRX15927633	SRP383983	SAMN29402213	47,835,782	49%	25%
SRR19884310	SRX15927632	SRP383983	SAMN29402214	48,219,462	45%	23%
SRR19884311	SRX15927631	SRP383983	SAMN29402215	48,368,838	47%	24%
SRR19884312	SRX15927630	SRP383983	SAMN29402216	48,603,200	44%	21%
SRR19884313	SRX15927629	SRP383983	SAMN29402217	48,635,432	43%	23%
SRR19884314	SRX15927628	SRP383983	SAMN29402218	48,692,340	41%	20%
SRR19884315	SRX15927627	SRP383983	SAMN29402219	48,642,778	44%	22%
SRR20336823	SRX16369902	SRP387339	SAMN29855636	836,712,930	6%	22%
SRR20336824	SRX16369901	SRP387339	SAMN29855637	809,207,912	6%	25%
SRR20336826	SRX16369899	SRP387339	SAMN29855639	812,336,060	7%	32%
SRR21770805	SRX17765795	SRP400465	SAMN31077260	259,713,662	67%	17%

Protein alignments

The alignments of the following proteins with ProSplign were used for gene prediction:

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Acanthaster planci high-quality model RefSeq (XP_)	14,303	11,231 (78.52%)	11,231 (78.52%)	64.08%	56.43%
Patiria miniata high-quality model RefSeq (XP_)	16,681	12,939 (77.57%)	12,939 (77.57%)	64.03%	55.01%
Drosophila melanogaster known RefSeq (NP_)	30,786	14,475 (47.02%)	14,475 (47.02%)	64.37%	51.54%
Asterias rubens high-quality model RefSeq (XP_)	12,512	10,048 (80.31%)	10,048 (80.31%)	65.18%	59.91%
Echinoidea GenBank	2,442	2,043 (83.66%)	2,043 (83.66%)	68.68%	78.96%
Same-species GenBank	16	16 (100.00%)	16 (100.00%)	87.61%	95.32%
Strongylocentrotus purpuratus high-quality model RefSeq (XP_)	19,173	17,628 (91.94%)	17,628 (91.94%)	74.57%	81.79%
Strongylocentrotus purpuratus known RefSeq (NP_)	425	402 (94.59%)	402 (94.59%)	79.74%	84.21%
Homo sapiens known RefSeq (NP_)	66,907	40,208 (60.10%)	40,208 (60.10%)	62.27%	49.83%
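
The percentages in parentheses in the table above are the aligned count divided by the number of sequences retrieved from Entrez. The sketch below recomputes them for a few rows as a sanity check. The helper name is illustrative, and the counts are copied from the table.

```python
def percent_aligned(aligned: int, retrieved: int) -> str:
    """Format aligned/retrieved as a percentage with two decimals."""
    return f"{100.0 * aligned / retrieved:.2f}%"

# (source, retrieved from Entrez, aligned by ProSplign), copied from the table.
protein_sets = [
    ("Acanthaster planci high-quality model RefSeq (XP_)", 14_303, 11_231),
    ("Strongylocentrotus purpuratus high-quality model RefSeq (XP_)", 19_173, 17_628),
    ("Strongylocentrotus purpuratus known RefSeq (NP_)", 425, 402),
    ("Homo sapiens known RefSeq (NP_)", 66_907, 40_208),
]

recomputed = {
    source: percent_aligned(aligned, retrieved)
    for source, retrieved, aligned in protein_sets
}
for source, pct in recomputed.items():
    print(f"{source}: {pct}")
```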

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
BUSCO: Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. Molecular Biology and Evolution 2021, 38(10):4647-4654
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
STAR: Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. Bioinformatics 2013, 29(1):15-21
Minimap2: Li H. Bioinformatics 2018, 34(18):3094-3100
