HTTPS at NCBI: Guidance for Users
What is happening?
To improve security and privacy, and by Federal government mandate, NCBI moved its Web sites to HTTPS only by September 30, 2016.
To give software vendors time to respond, that deadline was extended for users of NCBI Web APIs to November 9, 2016.
This document is retained for historical and informational purposes. Advice appearing on this page about HTTPS compliance remains valid.
If you use NCBI only through a Web browser (like Safari, Firefox, Chrome, Internet Explorer, Opera, etc.), this document is not of interest to you. The only changes you should notice after the deadline are that a green lock icon appears in the browser's address bar, and that the web addresses of the NCBI pages you visit start with https://.
If you maintain software that uses NCBI APIs or accesses NCBI servers through the Web, you should understand this change and update your software before the deadline to ensure uninterrupted service.
NCBI Web services include APIs such as NCBI eutilities and BLAST URLAPI that client applications use to access NCBI data. A number of them (though not a comprehensive set) are listed on or linked from our APIs page.
Applications that access NCBI web servers using http:// URLs, instead of https:// URLs, may fail partially or completely after NCBI switches to HTTPS-only.
This document explains our transition plan, and provides guidance to developers about how to update their applications (scripts, server-side applications like CGIs, browser plugins, etc.), before the switchover, to prevent failure.
NCBI is moving all web services to HTTPS
The HTTP protocol does not provide encryption, so anyone who can see web traffic between a client (for example, a web browser) and a server can intercept potentially sensitive information, and/or inject malware into users' browsers or operating systems. HTTPS solves this problem. It works just like HTTP, except that traffic is encrypted in both directions, so observers between the client and the server can't intercept or tamper with the requests or responses. It also provides authentication, ensuring that the client is communicating with the intended server given by the hostname, and not some impostor.
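As a minimal illustration of the authentication point, and assuming a Python client (this document does not prescribe a language), the standard library's default TLS context verifies both the server's certificate chain and its hostname. This is only a sketch of one environment's defaults; other languages and libraries may behave differently.

```python
import ssl

# Default client-side context: uses the system's trusted CA certificates.
context = ssl.create_default_context()

# Both checks are enabled by default. Disabling either one removes the
# protection against impostor servers described above.
print(context.check_hostname)                    # True
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```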
The Federal Office of Management and Budget requires all Federal Web sites to switch to HTTPS-only (meaning, HTTP will be disabled) by December 31, 2016. However, NCBI, being a part of the National Library of Medicine, had an earlier deadline of September 30, 2016.
All public-facing web pages at NCBI now operate exclusively over https. To give software vendors and their customers more time to update their software, NCBI extended the deadline for web service https compliance to November 9, 2016.
Update your applications as soon as possible
NCBI Web resources are all available now on HTTPS, so you can update your software immediately.
To ensure that your applications work before and after the switchover, update them so that URLs for all requests to NCBI servers start with https: instead of http:. For example, if your application searches PubMed using http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi, update it to use https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi instead. Please report any problems you encounter to info@ncbi.nlm.nih.gov.
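If NCBI URLs are scattered through configuration files or stored data, a small helper can upgrade them in one place. The following is a minimal sketch, assuming a Python application; the function name `to_https` and the constant `NCBI_DOMAIN` are hypothetical and not part of any NCBI API.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical constant: only hosts under this domain are rewritten.
NCBI_DOMAIN = "ncbi.nlm.nih.gov"

def to_https(url):
    """Return url with http:// upgraded to https:// for NCBI hosts only."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    is_ncbi = host == NCBI_DOMAIN or host.endswith("." + NCBI_DOMAIN)
    if parts.scheme != "http" or not is_ncbi:
        return url  # already HTTPS, or not an NCBI URL: leave untouched
    netloc = parts.netloc
    if parts.port == 80:
        # An explicit :80 would be wrong for HTTPS, so drop it.
        netloc = netloc.rsplit(":", 1)[0]
    return urlunsplit(parts._replace(scheme="https", netloc=netloc))

print(to_https("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"))
```

The helper deliberately leaves non-NCBI URLs alone, since this document makes no claim about HTTPS support on other hosts.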
Many script authors access NCBI services using third-party libraries like biojava, bioperl, biopython, bioruby, etc. In these cases, you may be able to update your application by simply updating the library you use to the most recent version. The table below provides information on versions of libraries we know about that already use HTTPS to interact with NCBI servers.
Library | Uses HTTPS for NCBI services? | Compliant Version |
---|---|---|
BioJava | Yes | biojava-legacy 1.9.3 |
BioPerl | Yes | BioPerl 1.7 (pull request) |
Biopython | Yes | Biopython 1.67 (release notes) |
BioRuby | Yes | BioRuby 1.5.1 (github issue) (release notes) |
biogo | Yes | most recent HEAD revision of master branch (github issue) |
reutils (R) | Yes | 0.2.3, see https://github.com/gschofl/reutils (github issue) |
Once you have updated and tested your application, it will continue to work as before, and no other action is required. This is the best option for scripts, CGIs, and other Web client software for which you have the source code and the ability to update it and deploy a new release before the deadline.
After November 9, 2016, NCBI HTTP servers will redirect or reject all HTTP requests.
All interactive web traffic to NCBI servers has been successfully moved to HTTPS. After the switchover date, November 9, 2016, web services such as eutilities and BLAST URLAPI will also begin redirecting HTTP requests to HTTPS.
If you do not update your application before the switchover date, these redirects from NCBI HTTP servers may buy you time to make the updates later.
After November 9, 2016, NCBI HTTP servers, including Web services, will:
- respond with a server-side redirect (HTTP 301 Moved permanently) to the corresponding URL on HTTPS, only for HTTP GET and HEAD requests;
- respond with HTTP 403 Forbidden and an error message, to all requests other than GET and HEAD (including and especially HTTP POST);
- include in every HTTPS response an HTTP Strict Transport Security (HSTS) header, which instructs browsers to automatically communicate thereafter only with HTTPS on that domain. (HSTS applies only to browsers, though other Web clients like scripts are free to implement it.) The HSTS header has a 1-year expiration date.
- include in every HTTPS response the header Content-Security-Policy: upgrade-insecure-requests, which causes most browsers to automatically upgrade http:// links to https://, automatically avoiding most mixed content problems.
After switchover, the HTTP redirects will remain in place for an as-yet undetermined period, but at least until the Federal deadline of December 31, 2016.
After switchover, applications that access NCBI APIs using HTTP may fail
After the switchover date, applications that still try to access NCBI via HTTP (i.e., on port 80) may fail for a few possible reasons:
- Your programming environment's HTTP facility does not automatically follow redirects from HTTP to HTTPS. Some libraries follow redirections from HTTP to HTTPS; others do not. Java's URLConnection, for example, does not automatically follow HTTP-to-HTTPS redirects by design, even for safe methods like GET and HEAD.
- Your application uses HTTP verbs other than GET and HEAD. All other HTTP requests (including especially POST and PUT requests) to HTTP URLs at NCBI will fail unconditionally (with HTTP 403 Forbidden) after the switchover date.
- Your application accesses NCBI resources through a proxy. Some organizations use proxy servers to access the NCBI web site. These proxy servers must communicate with NCBI using HTTPS, which means they need valid certificates. If your application accesses NCBI through a proxy, check with the proxy vendor about HTTPS support and how to add or update certificates.
- Your programming environment does not support HTTPS.
In any of these cases, if the application does not work with HTTPS, the only solution is to update all your NCBI URLs to use HTTPS exclusively.
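Because POST requests to HTTP URLs are rejected rather than redirected, clients that POST need the HTTPS URL from the start. The following is a minimal sketch, assuming a Python client that uses only the standard library; the helper name `build_esearch_post` is hypothetical, and the search term is just an example value.

```python
from urllib.parse import urlencode
from urllib.request import Request

# HTTPS endpoint taken from the example earlier in this document.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_post(term, db="pubmed"):
    """Build (but do not send) a POST request to esearch over HTTPS."""
    body = urlencode({"db": db, "term": term}).encode("ascii")
    return Request(ESEARCH_URL, data=body, method="POST")

request = build_esearch_post("asthma")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=30)
```

Building the request separately from sending it makes it easy to confirm the scheme and method in a unit test without touching the network.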
Some requests will be temporarily exempt from redirection
For various technical reasons, certain requests will be temporarily exempt from redirection. Once the underlying technical issue is resolved, the exemption will be lifted, and redirection will begin without further public warning.
The following http requests will be temporarily exempt from redirection:
- Requests with request-uri matching the regular expression \.(xsd|xml|dtd|ent)$
- Requests to the hosts dtd.nlm.nih.gov and jats.nlm.nih.gov
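To check whether a given HTTP URL falls under these exemptions, the two rules can be expressed directly in code. This is a sketch under stated assumptions: the function name `is_temporarily_exempt` is hypothetical, the request-uri is taken to be the path plus any query string, and the paths in the usage lines are made-up examples.

```python
import re
from urllib.parse import urlsplit

# Pattern and hosts copied from the list above.
EXEMPT_URI = re.compile(r"\.(xsd|xml|dtd|ent)$")
EXEMPT_HOSTS = {"dtd.nlm.nih.gov", "jats.nlm.nih.gov"}

def is_temporarily_exempt(url):
    """Return True if an http:// URL matches either exemption rule."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return False  # exemptions only concern plain-HTTP requests
    host = (parts.hostname or "").lower()
    # Assumption: request-uri means the path plus "?query" when present.
    request_uri = parts.path + ("?" + parts.query if parts.query else "")
    return host in EXEMPT_HOSTS or bool(EXEMPT_URI.search(request_uri))

print(is_temporarily_exempt("http://www.ncbi.nlm.nih.gov/example/schema.xsd"))
print(is_temporarily_exempt("http://www.ncbi.nlm.nih.gov/pubmed"))
```

Since the exemptions may be lifted without further warning, a result of True is not a reason to keep using HTTP for those URLs.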
Redirects will be maintained indefinitely
All public NCBI servers are already enabled for HTTPS, so you can update your application to use HTTPS now, and test it on our live servers. Once you have updated to HTTPS, no further action is required. Please send questions or report problems to info@ncbi.nlm.nih.gov.
In keeping with current US Federal Government policy, NCBI intends to maintain these redirects on public servers indefinitely. Nevertheless, it is to your advantage to update your applications to use HTTPS only as soon as possible, both for performance and security reasons.
About Referrers
A "referrer" is an HTTP header, Referer [sic] (exposed to CGI programs as HTTP_REFERER),
that contains the address of the webpage that linked to the page
being retrieved. Some websites analyze referrers to better
understand their incoming web traffic; for example, to find out
what percentage of their traffic comes from a particular search
engine. But third-party websites can also use referrer information
to discover information about individual users,
such as their search terms and the pages they have visited.
Because of this privacy concern, NCBI's website tells web
browsers to limit the referrer to just the scheme and domain
name (e.g., https://www.ncbi.nlm.nih.gov), and to omit the
request URI and query string. This limitation is enforced
by the Referrer-Policy HTTP header and the
<meta name="referrer"> meta tag. Limiting the
referrer to just the scheme and domain name
balances the user's right to privacy with website
owners' need to understand their web traffic. See
The Meta
Referrer Tag: An Advancement for SEO and the Internet
for a detailed description of the problem and the solution.
This policy follows official cio.gov guidance on referrers; see http://bit.ly/gov-https-referrer for details.
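To make the effect of this policy concrete, the sketch below computes what a browser would send as the referrer when it is limited to scheme and domain name. It is an illustration only: the function name `origin_only_referrer` is hypothetical, the sample URL is an example, and the trailing slash reflects how browsers typically serialize an origin-only referrer, which is an assumption about browser behavior rather than something this page specifies.

```python
from urllib.parse import urlsplit

def origin_only_referrer(page_url):
    """Reduce a full page URL to scheme and host, dropping path and query."""
    parts = urlsplit(page_url)
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"  # keep a non-default port if one is given
    return f"{parts.scheme}://{host}/"

print(origin_only_referrer("https://www.ncbi.nlm.nih.gov/pubmed/?term=asthma"))
# https://www.ncbi.nlm.nih.gov/
```

The search term in the query string never reaches the third-party site, which is the privacy benefit described above.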
For more information
For more on the US Federal government HTTPS-only initiative, see https://https.cio.gov.
For questions, comments, or problems, contact the NCBI service desk at info@ncbi.nlm.nih.gov.