
NCBI SRA Import

The Sequence Read Archive (SRA) is a publicly accessible database of raw sequencing data hosted by NCBI. It contains high-throughput sequencing data from numerous research projects worldwide, covering a wide range of study types and organisms, including metagenomic and microbiome studies. Researchers deposit their sequencing reads in SRA to make data publicly available in accordance with scientific publication practices, fostering transparency and reproducibility in science. The deposited data is typically referenced in the paper by a BioProject number in the Data Availability section.

👍

Why Import SRA Data for Microbiome Analysis?

Access to Publicly Available Data: Researchers can leverage publicly available SRA datasets to replicate studies, validate findings, or perform meta-analyses against their own data.

Cost Efficiency: Using existing SRA datasets reduces the need for sequencing of new samples when performing secondary analyses or meta-analyses. By importing previously published data, users can rapidly evaluate different datasets or integrate multiple studies without the need to collect and sequence new samples.

Extended Contextual Analysis: By analyzing previously published microbiome datasets from SRA, researchers can add context to their own data. This can provide valuable insights by comparing microbiomes across different populations, environments, diseases, or experimental conditions.

Data Reuse for Novel Insights: Many research questions can be answered by revisiting older datasets with new hypotheses or more advanced analysis techniques. The CosmosID-HUB allows users to explore data from fresh perspectives using its comprehensive suite of microbiome analysis tools, potentially leading to novel discoveries.


To Get Started:

1. Navigate to the SRA Upload tool within the Upload Menu

From anywhere in the app, you can expand the navigation menu by clicking the hamburger icon on the upper left side of the screen, then click "Upload" to navigate to the upload page.

Click "NCBI SRA" to launch the tool and input your SRA Accession Numbers.

2. Input your SRA Accession Numbers

Shotgun Metagenomic, 16S, or ITS samples from SRA can be input using SRR/DRR/ERR Accessions.

Input up to 500 accessions (separated by comma, space, or semicolon) and click "Search". The tool will find the data files (either single-end or paired-end) and report if any are unavailable.
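When working from a long list copied out of a paper or spreadsheet, it can help to clean it up before pasting it into the search box. The following is a minimal sketch, not part of the CosmosID-HUB itself: the function name `prepare_accessions` is hypothetical, the accession pattern (SRR/DRR/ERR followed by digits) is an assumption about run accession format, and the example accessions are placeholders.

```python
import re

# Assumption: run accessions are SRR/DRR/ERR followed by digits.
RUN_ACCESSION = re.compile(r"[SDE]RR\d+")
MAX_PER_SEARCH = 500  # per-search limit stated in this guide


def prepare_accessions(raw, batch_size=MAX_PER_SEARCH):
    """Split raw text into unique run accessions, grouped into search-sized batches.

    Returns (batches, rejected), where rejected holds tokens that do not
    look like SRR/DRR/ERR run accessions (for example BioProject numbers).
    """
    # Accept the same separators as the search box: comma, space, semicolon.
    tokens = [t.upper() for t in re.split(r"[,\s;]+", raw) if t]
    valid, rejected, seen = [], [], set()
    for token in tokens:
        if not RUN_ACCESSION.fullmatch(token):
            rejected.append(token)
        elif token not in seen:  # drop duplicates, keep first-seen order
            seen.add(token)
            valid.append(token)
    batches = [valid[i:i + batch_size] for i in range(0, len(valid), batch_size)]
    return batches, rejected


if __name__ == "__main__":
    # Placeholder accessions for illustration only.
    text = "SRR1234567, ERR7654321; DRR000001 PRJNA12345 srr1234567"
    batches, rejected = prepare_accessions(text)
    for batch in batches:
        print(", ".join(batch))  # paste each line into the search box
    print("Not run accessions:", rejected)
```

Each printed batch stays within the 500-accession limit, and anything rejected (such as a BioProject number) needs to be resolved to its run accessions before searching.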


3. Select your desired folder location and data type

NOTE: Folders must be pre-generated in the "Cohorts and Metadata" menu before uploading samples. You can also upload to the /Home folder and move the samples into subfolders after upload.

You are required to select the TYPE of data from the following options:

  • Shotgun Metagenomics - using whole genome shotgun sequencing, the CosmosID algorithms identify microorganisms based on entire genomes represented in our database and in your samples
  • Amplicon 16S - for bacterial 16S identification. Unlike shotgun metagenomics, amplicon 16S analysis uses only the bacterial 16S rRNA gene, not the entire genome, for identification. This option requires the primer sequences used to generate the libraries (e.g., V1V2, V3V4, V4).
    • You can select from standard primer sequences or input your own primer sequence
  • Amplicon ITS - the amplicon ITS database is for fungal ITS identification.

3. Remove host reads

Selecting the sample host removes host reads, which increases precision and accuracy in microbiome analysis, especially for functional profiling. You can select from various hosts, from humans to domestic cats!

📘

If running CHAMP Human Profiler:

Select "None" as the host when running the CHAMP Human Microbiome Profiler. CHAMP has a native host-removal algorithm for removing human host reads, and selecting a host for this pipeline will result in the sample analysis failing.


4. Upload your Metadata (Optional)

Upload your metadata using a CSV template. This step can be skipped and completed after data upload.

5. Select your workflow

The HUB has several workflows to choose from when uploading your samples:

Shotgun Metagenomic Data

  • CHAMP™ Human Taxonomic & Functional Microbiome Profiling
  • Kepler Host-Agnostic Taxonomic Microbiome Profiling
  • Host-Agnostic Functional Profiling
  • AMR and Virulence Markers (module required)

Amplicon Data

  • ASV 16S rRNA Amplicon Profiling
  • OTU ITS rRNA Amplicon Profiling

Credit charges are assessed upon successful upload and analysis based on the data type. The CosmosID-HUB team will assign you the correct number of credits according to your subscription terms.

See the chart below for the number of credits for different analyses.

| Data type | Taxa | Function | AMR and Virulence Marker |
|---|---|---|---|
| WGS Deep | 4 | 4 | 2 |
| Amplicon 16S | 1 | N/A | N/A |
| Amplicon ITS | 1 | N/A | N/A |
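To budget for a batch before uploading, the chart can be turned into a small calculation. This is only a sketch under stated assumptions: it reads the chart as 4, 4, and 2 credits for WGS Deep and 1 credit for amplicon taxa, and it assumes credits are charged per sample and add up across the analyses you run, which this guide does not state explicitly. The function name `estimate_credits` is hypothetical. Confirm actual charges against your subscription terms.

```python
# Credits transcribed from the chart above; None marks an N/A combination.
CREDITS = {
    "WGS Deep": {"taxa": 4, "function": 4, "amr_virulence": 2},
    "Amplicon 16S": {"taxa": 1, "function": None, "amr_virulence": None},
    "Amplicon ITS": {"taxa": 1, "function": None, "amr_virulence": None},
}


def estimate_credits(data_type, analyses, n_samples):
    """Estimate total credits for n_samples of one data type.

    Assumption: charges are per sample and additive across analyses.
    """
    per_sample = 0
    for analysis in analyses:
        cost = CREDITS[data_type][analysis]
        if cost is None:
            raise ValueError(f"{analysis!r} is not available for {data_type}")
        per_sample += cost
    return per_sample * n_samples


if __name__ == "__main__":
    # 10 shotgun samples with taxonomic and functional profiling.
    print(estimate_credits("WGS Deep", ["taxa", "function"], 10))
    # 24 amplicon 16S samples with taxonomic profiling only.
    print(estimate_credits("Amplicon 16S", ["taxa"], 24))
```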

6. Click "Start Upload" to upload your samples

Once uploaded, you can view the status of your analysis using the status indicator next to the samples in the Cohorts and Metadata Menu.