Accurate identification of taxon-specific molecular markers in plants based on DNA signature sequence

Mol Ecol Resour. 2023 Jan;23(1):106-117. doi: 10.1111/1755-0998.13697. Epub 2022 Aug 24.

Abstract

Accurate identification of plants is the basis for plant diversity conservation and remains a significant challenge for taxonomists. Although DNA barcoding methods are commonly used for plant identification, they are limited by the low amplification success and low discriminative power of the selected genomic regions. In this study, we developed a k-mer-based approach, the DNA signature sequence (DSS), to accurately identify plant taxon-specific markers, especially at the species level. A DSS is a constant-length nucleotide sequence capable of identifying a taxon and distinguishing it from other taxa. This is the first large-scale study of DSS markers in plants. DSS candidates for 3899 angiosperm species were calculated from a chloroplast data set of 4356 assemblies. DSSs were validated in four species using Sanger sequencing of PCR amplicons and in 165 species using high-throughput sequencing. Based on these validations, the universality of the DSSs was over 79.38%. Several indicators influencing DSS marker identification and detection were also evaluated, and common criteria for applying DSSs in plant identification are proposed.
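The abstract does not spell out how DSS candidates are computed, so the following is only a minimal sketch of one plausible reading of the definition above: a candidate is a k-mer of fixed length that occurs in every assembly of the target taxon and in no assembly of any other taxon. The function names (`kmers`, `signature_candidates`), the toy sequences, and the choice of k = 4 are illustrative assumptions, not the authors' pipeline or parameters. Reverse complements are ignored for brevity.

```python
def kmers(seq, k):
    """Return the set of length-k substrings of seq containing only A, C, G, T."""
    seq = seq.upper()
    valid = set("ACGT")
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    }


def signature_candidates(assemblies, k):
    """Map each taxon to k-mers found in all of its assemblies and in no other taxon.

    assemblies: dict of taxon name -> non-empty list of sequences (hypothetical layout).
    """
    union_by_taxon = {}  # every k-mer seen in any assembly of the taxon
    core_by_taxon = {}   # k-mers shared by all assemblies of the taxon
    for taxon, seqs in assemblies.items():
        sets = [kmers(s, k) for s in seqs]
        union_by_taxon[taxon] = set().union(*sets)
        core_by_taxon[taxon] = set.intersection(*sets)

    candidates = {}
    for taxon, core in core_by_taxon.items():
        # Pool the k-mers of all other taxa, then discard any overlap.
        others = set().union(
            *(u for t, u in union_by_taxon.items() if t != taxon)
        )
        candidates[taxon] = core - others
    return candidates


# Toy input: made-up sequences, not real chloroplast data.
toy = {
    "sp_A": ["ACGTACGTTT", "GGACGTACGA"],
    "sp_B": ["TTTTGGGGCC"],
    "sp_C": ["ACGTCCCC"],
}
result = signature_candidates(toy, k=4)
for taxon in sorted(result):
    print(taxon, sorted(result[taxon]))
```

In this toy input, a k-mer shared between `sp_A` and `sp_C` is excluded from both candidate sets, which is the "distinguishing it from other taxa" part of the definition. A real chloroplast-scale run would need a much larger k and a memory-efficient k-mer counter rather than in-memory Python sets.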

Keywords: DNA signature sequence; chloroplast genome; plants; species identification.

MeSH terms

  • DNA Barcoding, Taxonomic / methods
  • DNA, Plant / genetics
  • Genetic Markers
  • High-Throughput Nucleotide Sequencing
  • Magnoliopsida* / genetics
  • Phylogeny
  • Plants* / genetics
  • Polymerase Chain Reaction
  • Sequence Analysis, DNA / methods

Substances

  • Genetic Markers
  • DNA, Plant