    SEMG1 semenogelin 1 [ Homo sapiens (human) ]

    Gene ID: 6406, updated on 28-Oct-2024

    Summary

    Official Symbol
    SEMG1 (provided by HGNC)
    Official Full Name
    semenogelin 1 (provided by HGNC)
    Primary source
    HGNC:HGNC:10742
    See related
    Ensembl:ENSG00000124233; MIM:182140; AllianceGenome:HGNC:10742
    Gene type
    protein coding
    RefSeq status
    REVIEWED
    Organism
    Homo sapiens
    Lineage
    Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
    Also known as
    SGI; SEMG; CT103; dJ172H20.2
    Summary
    The protein encoded by this gene is the predominant protein in semen. The encoded secreted protein is involved in the formation of a gel matrix that encases ejaculated spermatozoa. This preproprotein is proteolytically processed by the prostate-specific antigen (PSA) protease to generate multiple peptide products that exhibit distinct functions. One of these peptides, SgI-29, is an antimicrobial peptide with antibacterial activity. This proteolysis process also breaks down the gel matrix and allows the spermatozoa to move more freely. This gene and another similar semenogelin gene are present in a gene cluster on chromosome 20. [provided by RefSeq, Feb 2016]
    Expression
    Low expression observed in reference dataset

    Genomic context

    Location:
    20q13.12
    Exon count:
    3
    Annotation release  Status             Assembly                         Chr  Location
    RS_2024_08          current            GRCh38.p14 (GCF_000001405.40)    20   NC_000020.11 (45207033..45209768)
    RS_2024_08          current            T2T-CHM13v2.0 (GCF_009914755.1)  20   NC_060944.1 (46942926..46945661)
    RS_2024_09          previous assembly  GRCh37.p13 (GCF_000001405.25)    20   NC_000020.10 (43835674..43838409)
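
The `start..end` ranges in the Location column are 1-based and inclusive, so the gene's span on each assembly is `end - start + 1`. The following is a minimal sketch of that arithmetic, not an NCBI tool. The helper names `parse_location` and `span_length` are made up for illustration, and the location strings are copied from the rows above.

```python
import re

# Location strings copied from the Genomic context table in this record.
LOCATIONS = {
    "GRCh38.p14": "NC_000020.11 (45207033..45209768)",
    "T2T-CHM13v2.0": "NC_060944.1 (46942926..46945661)",
    "GRCh37.p13": "NC_000020.10 (43835674..43838409)",
}

# Matches 'ACCESSION.VERSION (start..end)'; hypothetical helper pattern.
_LOCATION_RE = re.compile(r"^(\S+)\s+\((\d+)\.\.(\d+)\)$")


def parse_location(text):
    """Return (accession, start, end) from 'ACCESSION (start..end)'."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    accession, start, end = match.groups()
    return accession, int(start), int(end)


def span_length(start, end):
    """Length in bases of a 1-based, inclusive interval."""
    return end - start + 1


for assembly, text in LOCATIONS.items():
    accession, start, end = parse_location(text)
    print(assembly, accession, span_length(start, end))
```

On all three assemblies the interval works out to 2736 bases, so the coordinates differ between builds but the annotated gene span does not.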

    Chromosome 20 - NC_000020.11

    Neighboring genes:

    • BRD4-independent group 4 enhancer GRCh37_chr20:43803133-43804332
    • long intergenic non-protein coding RNA 2597
    • peptidase inhibitor 3
    • uncharacterized LOC105372630
    • semenogelin 2
    • uncharacterized LOC124904913

    Genomic regions, transcripts, and products

    Expression

    • Project title: HPA RNA-seq normal tissues
    • Description: RNA-seq was performed on tissue samples from 95 human individuals representing 27 different tissues in order to determine the tissue specificity of all protein-coding genes
    • BioProject: PRJEB4337
    • Publication: PMID 24309898
    • Analysis date: Wed Apr 4 07:08:55 2018

    Bibliography

    GeneRIFs: Gene References Into Functions

    HIV-1 interactions

    Protein interactions

    Protein    Gene  Interaction                                                                                              Pubs
    Pr55(Gag)  gag   Cellular biotinylated semenogelin I protein (SEMG1) is incorporated into HIV-1 Gag virus-like particles  PubMed

    Pathways from PubChem

    Interactions

    General gene information

    Markers

    Potential readthrough

    Included gene: SEMG2

    Clone Names

    • FLJ78262, MGC14719

    Gene Ontology Provided by GOA

    Function                  Evidence Code                             Pubs
    enables protein binding   IPI (Inferred from Physical Interaction)  PubMed
    enables zinc ion binding  IMP (Inferred from Mutant Phenotype)      PubMed

    General protein information

    Preferred Names
    semenogelin-1
    Names
    SgI-29
    cancer/testis antigen 103
    semen coagulating protein
    semenogelin I

    NCBI Reference Sequences (RefSeq)

    RefSeqs maintained independently of Annotated Genomes

    These reference sequences exist independently of genome builds.

    These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by comparing the version of the RefSeq in this section to the one reported in Genomic regions, transcripts, and products above.
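
The version comparison described above can be sketched in a few lines. This is an illustrative helper, not part of any NCBI toolkit. The names `split_accession` and `find_version_mismatches` are hypothetical, and the "annotated" versions in the usage example are invented for demonstration because this record does not list them.

```python
def split_accession(versioned):
    """Split a versioned RefSeq accession such as 'NM_003007.5' into ('NM_003007', 5)."""
    accession, sep, version = versioned.partition(".")
    if not sep or not version.isdigit():
        raise ValueError(f"expected ACCESSION.VERSION, got {versioned!r}")
    return accession, int(version)


def find_version_mismatches(curated, annotated):
    """Map each shared accession to (curated_version, annotated_version) when the two differ."""
    annotated_versions = dict(split_accession(item) for item in annotated)
    mismatches = {}
    for item in curated:
        accession, version = split_accession(item)
        other = annotated_versions.get(accession)
        if other is not None and other != version:
            mismatches[accession] = (version, other)
    return mismatches


# Versions listed in this section of the record.
curated = ["NM_003007.5", "NP_002998.1"]
# Hypothetical versions for illustration only; not taken from this record.
annotated = ["NM_003007.4", "NP_002998.1"]

print(find_version_mismatches(curated, annotated))
```

Accessions that appear in only one of the two lists are ignored here; a stricter check might report them separately.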

    mRNA and Protein(s)

    1. NM_003007.5 → NP_002998.1  semenogelin-1 preproprotein

      Status: REVIEWED

      Source sequence(s)
      AA687874, BC055416, BP325874, J04440
      Consensus CDS
      CCDS13345.1
      UniProtKB/Swiss-Prot
      P04279, Q53ZV0, Q53ZV1, Q53ZV2, Q6X4I9, Q6Y809, Q6Y822, Q6Y823, Q86U64, Q96QM3
      Related
      ENSP00000361867.3, ENST00000372781.4
      Conserved Domains (1) summary
      pfam05474
      Location: 1 → 462
      Semenogelin; Semenogelin

    RefSeqs of Annotated Genomes: GCF_000001405.40-RS_2024_08

    The following sections contain reference sequences that belong to a specific genome build.

    Reference GRCh38.p14 Primary Assembly

    Genomic

    1. NC_000020.11 Reference GRCh38.p14 Primary Assembly

      Range
      45207033..45209768

    Alternate T2T-CHM13v2.0

    Genomic

    1. NC_060944.1 Alternate T2T-CHM13v2.0

      Range
      46942926..46945661

    Suppressed Reference Sequence(s)

    The following Reference Sequences have been suppressed.

    1. NM_198139.1: Suppressed sequence

      Description
      NM_198139.1: This RefSeq was permanently suppressed because the transcript lacked a 180 nt repeat unit in the coding sequence compared to the reference genome sequence.