NIH Preprint Pilot
The NIH Preprint Pilot is a project of the National Library of Medicine (NLM). During the pilot, NLM will make preprints resulting from research funded by the National Institutes of Health (NIH) available via PubMed Central (PMC) and, by extension, PubMed. The pilot aims to explore approaches to increasing the discoverability of early NIH research results posted to eligible preprint servers. PMC already makes available more than one million peer-reviewed papers resulting from NIH-supported research collected under the NIH Public Access Policy. This pilot builds on PMC's NIH repository role as well as 2017 NIH guidance (NOT-OD-17-050) that encourages investigators to use interim research products, such as preprints, to speed the dissemination and enhance the rigor of their work.
What is a preprint?
Preprints are complete and public drafts of scientific documents, not yet certified by peer review. These documents ensure that the findings of the research community are widely disseminated and that priorities of discoveries are established, and they invite feedback and discussion to help improve the work.
Certification by peer review is the key distinction between a preprint and an accepted author manuscript or published article. Many preprints are submitted to journals for publication, and as a result, subsequent versions of the paper may also be made available after peer review. Readers of preprints should be aware that any aspect of the research, including the results and conclusions, may change as a result of peer review (see PMC Disclaimer). Authors may also revise preprints and post updated versions to the preprint server.
Preprints in PMC and PubMed
During Phase 1 of the NIH Preprint Pilot, from June 2020 through June 2022, NLM made more than 3,300 preprints reporting NIH-supported COVID-19 research discoverable in PubMed Central (PMC) and PubMed. This narrowly scoped Phase 1 demonstrated that preprint records in PMC and PubMed could provide an avenue for discovery of NIH-supported research prior to journal publication during the ongoing public health emergency, making this research discoverable in PMC and PubMed earlier than it otherwise would have been.
Phase 2 of the NIH Preprint Pilot launched on January 30, 2023. This phase expands the scope of preprints included in PMC and PubMed beyond COVID-19 to include all preprints that report NIH-funded research and are posted to an eligible preprint server.
Visit our About PMC page to learn more about other types of content in PMC.
Phase 1 Scope (June 2020 - December 2022)
The first phase of the NIH Preprint Pilot focused on increasing the discoverability of preprints with NIH support relating to the SARS-CoV-2 virus and COVID-19.
Phase 2 Scope (January 2023 - )
The second phase of the NIH Preprint Pilot launched January 30, 2023, and encompasses all preprints that:
- Acknowledge direct NIH support and/or have an NIH-affiliated author; and
- Are posted to an eligible preprint server on or after January 1, 2023.
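The two Phase 2 scope criteria can be read as a simple predicate. The following is a minimal sketch only, not NLM's actual implementation: the `Preprint` record, its field names, and the `in_phase_2_scope` function are hypothetical, and the server set contains only the names listed under Eligible Preprint Servers on this page.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: only the servers named in this page's eligible-server list.
ELIGIBLE_SERVERS = {"arXiv", "Research Square"}

# Earliest posting date that is in scope for Phase 2.
PHASE_2_EARLIEST_POSTING = date(2023, 1, 1)


@dataclass(frozen=True)
class Preprint:
    """Hypothetical record holding only the fields the scope rule needs."""
    server: str
    posted_on: date
    acknowledges_nih_support: bool
    has_nih_affiliated_author: bool


def in_phase_2_scope(preprint: Preprint) -> bool:
    """Return True when a preprint meets both Phase 2 criteria."""
    # Criterion 1: direct NIH support and/or an NIH-affiliated author.
    has_nih_connection = (
        preprint.acknowledges_nih_support or preprint.has_nih_affiliated_author
    )
    # Criterion 2: posted to an eligible server on or after January 1, 2023.
    posted_in_window = (
        preprint.server in ELIGIBLE_SERVERS
        and preprint.posted_on >= PHASE_2_EARLIEST_POSTING
    )
    return has_nih_connection and posted_in_window
```

For example, a preprint posted to arXiv on January 1, 2023 that acknowledges NIH support would pass this check, while the same preprint posted on December 31, 2022 would not.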
Eligible Preprint Servers
Phase 1 of the pilot (June 2020 – present) has included preprints with NIH support identified in the iSearch COVID-19 Portfolio tool developed by the NIH Office of Portfolio Analysis.
Phase 2 is limited initially to those preprint servers that have been identified in Phase 1 as making content available in a way that is scalable across the spectrum of NIH research:
- arXiv, and
- Research Square.
In determining eligibility of a preprint server for inclusion in the pilot, NLM considers the following server policies and practices:
- Preprint records should be clearly marked as having not been certified by peer review.
- A screening process should be posted publicly, in addition to any policies relating to screening procedures, such as scope.
- Versioning of preprints should be supported and transparent, and links from the preprint to the published journal article should be provided, when available.
- A clear description of the license option(s) under which a preprint can be posted should be public.
- Preprint metadata should be openly available in human- and machine-readable formats.
- Preprint content should be free to read without registration.
- Transparent and rigorous policies should be posted about plagiarism, competing interests, and misconduct.
- An archiving strategy to ensure long-term access should be publicly stated.
- When circumstances dictate that a preprint must be withdrawn or removed, a preprint server should act consistently with its publicly stated policy for addressing the specific circumstances. Any action should be accompanied by a public notification.
These considerations are based on NIH guidance for selecting interim research product repositories (NOT-OD-17-050) and the recommendations for preprint servers outlined in the Committee on Publication Ethics Discussion Document (Version 1). Where applicable to preprints, NLM also looks for conformance with the Principles of Transparency and Best Practice in Scholarly Publishing (joint statement by COPE, DOAJ, WAME, and OASPA).
Further, in determining pilot eligibility, NLM also considers:
- The estimated volume of preprints with NIH support currently available in a preprint server;
- The scope of a preprint server; and
- The completeness of openly available metadata.
Finally, consistent with NIH guidance to investigators (NOT-OD-17-050), NLM strongly encourages eligible preprint servers to make available Creative Commons Attribution license options or the option to dedicate the work to the public domain.
NLM anticipates that these considerations may evolve over time as we continue to engage with the preprint server and broader scholarly communications communities.
To ensure that researchers, clinicians, and the public can all easily distinguish between preprints and the journal literature, PMC and PubMed include a prominent green information panel on all preprint records. The text in this panel notes that the article has not yet been peer reviewed and includes a link to more information about the “NIH Preprint Pilot.” A “Preprint” indicator has also been added to the citation metadata and Cite tool, both on the record page and in the search results of PMC and PubMed.
How Preprints Are Added to PMC and PubMed
The NIH Preprint Pilot workflow aims to minimize the effort required of NIH authors and investigators to make their research results posted as preprints discoverable in PMC and PubMed.
Once a week, NLM identifies preprints that are in scope for the pilot through available tools. Identification of in-scope preprints is done through a combination of text mining for acknowledgment of direct NIH support and use of the NIH Office of Portfolio Analysis tool to identify preprints with NIH-affiliated authors.
Preprint citation and abstract metadata is pulled from available web services to create an “article header” record in PMC, which includes any metadata for the preprint that can be accessed in machine-readable format. A PMCID is assigned at this time. This early record aims to facilitate rapid discovery.
All preprint records in PMC and PubMed link to the complete record on the preprint server website.
Once loaded to PMC, a corresponding PubMed record is created.
At the same time, those preprints that are made available under a Creative Commons license enter a conversion workflow to create archival full-text XML for PMC to enable broader discovery and support preservation.
The conversion process will take a few days.
Step 4 (for those preprints that undergo full-text conversion)
The full-text web version will be made available in PMC for full-text searching and integrated with NLM’s other literature and data resources.
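The license-dependent branch in this workflow can be illustrated with a short sketch. It is an assumption-laden illustration rather than a description of NLM's systems: the `PreprintRecord` type, the license strings, and the `is_creative_commons` and `planned_steps` helpers are all hypothetical, and the "starts with CC" check is a simplification of real license detection.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PreprintRecord:
    """Hypothetical preprint record; license is a free-text label or None."""
    title: str
    license: Optional[str]


def is_creative_commons(license_label: Optional[str]) -> bool:
    """Simplified check: treat labels beginning with 'CC' as Creative Commons."""
    return license_label is not None and license_label.strip().upper().startswith("CC")


def planned_steps(record: PreprintRecord) -> List[str]:
    """List the processing steps a record would go through, per the text above."""
    # Every in-scope preprint gets a header record in PMC and a PubMed record.
    steps = [
        "create PMC article header record and assign PMCID",
        "create corresponding PubMed record",
    ]
    # Only Creative Commons-licensed preprints enter full-text conversion.
    if is_creative_commons(record.license):
        steps.append("convert to archival full-text XML")
        steps.append("make full-text web version available in PMC")
    return steps
```

Under these assumptions, a record labeled "CC-BY-4.0" would yield four steps, while a record with no open license would stop after the PubMed record is created and would remain discoverable only through its header record and the link to the preprint server.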
Considerations for NIH Authors and Investigators
What authors and investigators need to know or consider about the NIH Preprint Pilot when posting a preprint:
- Acknowledge NIH Support: There is no submission system for depositing preprints to PMC. Rather, investigators should follow guidance NIH released in March 2017 (NOT-OD-17-050) to enable identification of preprints with NIH support, including clearly acknowledging NIH support. Preprints that do not clearly acknowledge direct NIH support will be considered out of scope.
- Consider Open License Options: NIH encourages investigators to select a Creative Commons Attribution (CC-BY) license or dedicate their work to the public domain. All preprints made available under a Creative Commons license will undergo full-text indexing in PMC to support preservation and greater discovery.
- Allow Time for Preprint Processes: New preprint records are ingested to PMC weekly; however, it may take up to two weeks for your preprint to be added to PMC and PubMed. If your preprint is in scope for the pilot but hasn’t been added to PMC within two weeks of posting, please contact firstname.lastname@example.org.
Finally, as a reminder, NIH investigators can include and link preprints to their award in My Bibliography and report them to NIH as products of award on their progress report publication list.