Setting up an instance on Google Cloud Platform (GCP)
Overview
Google Cloud Platform (GCP) hosts all Google Cloud services, which are available through the GCP console. You will use the Compute Engine service on GCP to create a virtual machine (VM) instance. SRA data can be accessed for free from any US region and from any zone within that region. This guide helps you set up a basic instance that you can use to become familiar with the whole process. Helpful tips on creating your sample instance using default settings are included below.
You may need to increase the number of CPUs, the memory, or the size of the boot disk once you are ready to do your analysis.
Sign in and Enter the GCP Console
Sign in using your Google account: GCP Console
Launch a GCP Instance
Create a GCP Instance
- Click the Create Instance button in the top header, which takes you to the instance configuration page.
- Give the instance a name that reminds you of the machine's characteristics, so that it is easier to identify when you return, for example, "us-east-10gb".
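Compute Engine instance names must be 1 to 63 characters long, start with a lowercase letter, contain only lowercase letters, digits, and hyphens, and not end with a hyphen. The following is a minimal Python sketch of a naming helper that follows the example above. The function name make_instance_name and the "prefix plus disk size" convention are illustrative assumptions, not part of any Google or NCBI tool.

```python
import re

# Compute Engine name rule: 1-63 characters, lowercase letter first, then
# lowercase letters, digits, or hyphens, and no trailing hyphen.
NAME_RULE = re.compile(r"[a-z]([-a-z0-9]{0,61}[a-z0-9])?")


def make_instance_name(prefix: str, disk_gb: int) -> str:
    """Hypothetical helper: combine a location prefix and a boot disk size."""
    name = f"{prefix}-{disk_gb}gb".lower()
    if not NAME_RULE.fullmatch(name):
        raise ValueError(f"not a valid instance name: {name!r}")
    return name


print(make_instance_name("us-east", 10))  # prints us-east-10gb
```

Checking the name before you open the console avoids a rejected form when the name contains characters such as underscores or spaces.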
Settings
Select the region and zone in which to create the instance. For free access to the SRA data, choose a US region and one of its zones, for example, us-east1 (region) and us-east1-b (zone). Click the Save button.
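A zone name is its region name followed by a hyphen and a zone letter, as in the us-east1 and us-east1-b pair above. The short Python sketch below derives the region from a zone and checks whether it is a US region. The helper names region_of and is_us_zone are hypothetical, and the check assumes that US regions are exactly those whose names begin with "us-".

```python
def region_of(zone: str) -> str:
    """Return the region part of a zone name, e.g. 'us-east1-b' -> 'us-east1'."""
    region, sep, suffix = zone.rpartition("-")
    if not sep or not region or not suffix:
        raise ValueError(f"not a zone name: {zone!r}")
    return region


def is_us_zone(zone: str) -> bool:
    """Assumption: a region is in the US when its name starts with 'us-'."""
    return region_of(zone).startswith("us-")


print(region_of("us-east1-b"), is_us_zone("us-east1-b"))
```

A check like this is useful in scripts that create instances, so that a zone outside the US is caught before any SRA data is transferred.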
Machine Configuration
For your basic instance, you can leave the defaults. Once you are ready to analyze the SRA data, this is where you would change the CPU and RAM settings for your instance.
Boot Disk
Do not change the default (10 GB) for your first test. Note that this is where you would change the amount of disk space available to the instance. To download large amounts of data, you will need to increase the boot disk size based on your needs.
Here, you can also choose in which region and zone to create the instance.
There is also a link provided to view Google's most recent Compute Engine pricing.
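The same settings (name, zone, machine type, and boot disk size) can also be passed to the gcloud command-line tool instead of the console form. The Python sketch below only builds and prints the command; it does not run it. The function name create_instance_command is hypothetical, and the machine type e2-medium is an illustrative assumption rather than the console default.

```python
import shlex


def create_instance_command(name: str, zone: str,
                            machine_type: str, boot_disk_gb: int) -> list[str]:
    """Build (but do not run) a 'gcloud compute instances create' command."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",   # CPU and RAM are set by the machine type
        f"--boot-disk-size={boot_disk_gb}GB",
    ]


# Example values only: a basic test instance with a 10 GB boot disk.
cmd = create_instance_command("us-east-10gb", "us-east1-b", "e2-medium", 10)
print(shlex.join(cmd))
```

To run the printed command, the Google Cloud CLI must be installed and authenticated for your project.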
Connecting to an Instance
To connect to your instance, follow Google's guide Connecting to instances.
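One of the connection methods is SSH through the gcloud command-line tool. As a minimal sketch, assuming the Google Cloud CLI is installed and that the instance name and zone are the example values used in this guide, the command can be assembled in Python as follows. The helper name ssh_command is hypothetical, and the command is only printed, not executed.

```python
import shlex


def ssh_command(name: str, zone: str) -> list[str]:
    """Build (but do not run) a 'gcloud compute ssh' command for an instance."""
    return ["gcloud", "compute", "ssh", name, f"--zone={zone}"]


# Example values only; substitute your own instance name and zone.
print(shlex.join(ssh_command("us-east-10gb", "us-east1-b")))
```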
The SRA Toolkit on the GCP
Installing the SRA Toolkit on your instance
Once you are connected, you will be working in a Unix-like command-line environment where you can install and configure the SRA Toolkit.
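After installation, it can help to confirm that the toolkit's programs are on your PATH. The Python sketch below looks up three SRA Toolkit executables (prefetch, fasterq-dump, and vdb-config) and reports any that are not found. The function name find_sra_tools is hypothetical, and the list of tools checked is an assumption about which ones you intend to use.

```python
import shutil

# SRA Toolkit executables to look for; adjust to the tools you need.
SRA_TOOLS = ("prefetch", "fasterq-dump", "vdb-config")


def find_sra_tools(tools=SRA_TOOLS) -> dict:
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


missing = [tool for tool, path in find_sra_tools().items() if path is None]
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All SRA Toolkit tools found.")
```

If a tool is reported as missing, check that the toolkit's bin directory has been added to PATH in your shell session.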
Using the SRA Toolkit on the GCP
- To download public SRA data from our cloud buckets to your cloud storage, use the SRA Toolkit utilities as described in the SRA Download Guide.
- To download dbGaP data from our cloud buckets to your cloud storage, you need to use a jwt.cart file as described in Downloading dbGaP data with JWT.
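For public data, a common pattern is to fetch a run with prefetch and then convert it to FASTQ with fasterq-dump. The Python sketch below builds those two commands for a run accession and prints them without running anything. The helper name download_commands is hypothetical, the accession SRR000001 is only an example, and the accession pattern (SRR, ERR, or DRR followed by digits) is a simplifying assumption. dbGaP downloads are not covered here; follow the JWT guide for those.

```python
import re
import shlex

# Simplified run-accession pattern: SRR, ERR, or DRR followed by digits.
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def download_commands(accession: str) -> list[list[str]]:
    """Build (but do not run) the prefetch and fasterq-dump commands for a run."""
    if not RUN_ACCESSION.fullmatch(accession):
        raise ValueError(f"not a run accession: {accession!r}")
    return [
        ["prefetch", accession],      # download the run
        ["fasterq-dump", accession],  # convert the downloaded run to FASTQ
    ]


for command in download_commands("SRR000001"):
    print(shlex.join(command))
```

FASTQ output can be much larger than the downloaded run, so check the available disk space on the instance before converting large runs.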
YouTube Video Tutorial - Setting up GCP - demo
Engage
NCBI wants your feedback on SRA in the Cloud. Contact sra@ncbi.nlm.nih.gov with questions or if you would like to provide input on new functionality.