NCBI SRA Import
The Sequence Read Archive (SRA) is a publicly accessible database of raw sequencing data hosted by NCBI. It contains high-throughput sequencing data from numerous research projects worldwide, covering a wide range of study types and organisms, including metagenomic and microbiome studies. Researchers deposit their sequencing reads in SRA to make data publicly available in accordance with scientific publication practices, fostering transparency and reproducibility in science. The deposited data is typically referenced in the paper by a BioProject number in the Data Availability section.
Why Import SRA Data for Microbiome Analysis?
Access to Publicly Available Data: Researchers can leverage publicly available SRA datasets to replicate studies, validate findings, or perform meta-analyses against their own data.
Cost Efficiency: Using existing SRA datasets reduces the need for sequencing of new samples when performing secondary analyses or meta-analyses. By importing previously published data, users can rapidly evaluate different datasets or integrate multiple studies without the need to collect and sequence new samples.
Extended Contextual Analysis: By analyzing previously published microbiome datasets from SRA, researchers can add context to their own data. This can provide valuable insights by comparing microbiomes across different populations, environments, diseases, or experimental conditions.
Data Reuse for Novel Insights: Many research questions can be answered by revisiting older datasets with new hypotheses or more advanced analysis techniques. The CosmosID-HUB allows users to explore data from fresh perspectives using its comprehensive suite of microbiome analysis tools, potentially leading to novel discoveries.
To Get Started:
1. Navigate to the SRA Upload tool within the Upload Menu
From anywhere in the app, expand the navigation menu by clicking the hamburger icon in the upper-left corner of the screen, then click "Upload" to open the upload page.
Click "NCBI SRA" to launch the SRA Upload tool, where you will input your SRA Accession Numbers.
2. Input your SRA Accession Numbers
Shotgun Metagenomic, 16S, or ITS samples in SRA can be input using their SRR, DRR, or ERR run accessions.
Input up to 500 accessions (separated by comma, space, or semicolon) and click "Search". The tool will find the data files (either single-end or paired-end) and report if any are unavailable.
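If you are assembling a long accession list outside the HUB, a small pre-check can catch typos before you paste it into the search box. The sketch below is only an illustration and is not part of the CosmosID-HUB: the function names `parse_accessions` and `batches` are hypothetical, and upper-casing and de-duplicating the input are assumptions about what is convenient, not documented HUB behavior. It splits text on commas, spaces, and semicolons, keeps tokens that look like SRR/DRR/ERR run accessions, and groups them into batches of at most 500.

```python
import re

# Run accessions: SRR, ERR, or DRR followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")
MAX_PER_SEARCH = 500  # limit stated in step 2


def parse_accessions(raw):
    """Split raw text on commas, semicolons, and whitespace, then return
    (valid, invalid) lists with duplicates removed and order preserved."""
    tokens = [t.upper() for t in re.split(r"[,\s;]+", raw) if t]
    valid, invalid, seen = [], [], set()
    for token in tokens:
        if token in seen:
            continue
        seen.add(token)
        (valid if RUN_ACCESSION.match(token) else invalid).append(token)
    return valid, invalid


def batches(accessions, size=MAX_PER_SEARCH):
    """Yield consecutive slices of at most `size` accessions."""
    for start in range(0, len(accessions), size):
        yield accessions[start:start + size]


if __name__ == "__main__":
    valid, invalid = parse_accessions("SRR1234567, ERR7654321; DRR000001 PRJNA12345")
    print("valid:", valid)
    print("not run accessions:", invalid)
```

Anything reported as invalid here (for example a BioProject number such as `PRJNA12345`) is not a run accession and would need to be resolved to its SRR/DRR/ERR runs first.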
3. Select your desired folder location and data type
NOTE: Folders must be pre-generated in the "Cohorts and Metadata" menu before uploading samples. You can also upload to the /Home folder and move the samples into subfolders after upload.
You are required to select the TYPE of data from the following options:
- Shotgun Metagenomics - using whole genome shotgun sequencing, the CosmosID algorithms identify microorganisms based on entire genomes represented in our database and in your samples
- Amplicon 16S - for bacterial 16S identification. Unlike shotgun metagenomics, amplicon 16S analysis looks only at the relevant bacterial 16S rRNA genes, not the entire genome, for identification. This option requires the primer sequences used for generating libraries (e.g., V1V2, V3V4, V4).
  - You can select from standard primer sequences or input your own primer sequence
- Amplicon ITS - the amplicon ITS database is for fungal ITS identification.
Remove host reads
Selecting the sample host removes host reads, which improves precision and accuracy in microbiome analysis, especially for functional profiling. Available hosts range from humans to domestic cats.
If running CHAMP Human Profiler:
Select "None" as the host when running the CHAMP Human Microbiome Profiler. CHAMP has a native host-removal algorithm for removing human host reads, and selecting a host for this pipeline will cause the sample analysis to fail.
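The rule above is easy to overlook when preparing many uploads, so it can help to write it down as an explicit check. This is a minimal sketch for your own planning scripts, not HUB code: the `check_host_selection` function and the `CHAMP` label string are hypothetical names chosen for illustration.

```python
# Hypothetical label for the CHAMP pipeline, used only in this sketch.
CHAMP = "CHAMP Human Microbiome Profiler"


def check_host_selection(workflow, host):
    """Return a warning string for the combination the text warns about
    (CHAMP with any host other than "None"); return None otherwise."""
    if workflow == CHAMP and host != "None":
        return 'Select "None" as the host for CHAMP: it removes human reads itself.'
    return None


if __name__ == "__main__":
    print(check_host_selection(CHAMP, "Human"))
    print(check_host_selection(CHAMP, "None"))
```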
4. Upload your Metadata (Optional)
Upload your metadata using the CSV template. This step can also be skipped and completed after data upload.
5. Select your workflow
The HUB has several workflows to choose from when uploading your samples:
Shotgun Metagenomic Data
- CHAMP™ Human Taxonomic & Functional Microbiome Profiling
- Kepler Host-Agnostic Taxonomic Microbiome Profiling
- Host-Agnostic Functional Profiling
- AMR and Virulence Markers (module required)
Amplicon Data
- ASV 16S rRNA Amplicon Profiling
- OTU ITS rRNA Amplicon Profiling
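The pairing of data types and workflows listed above can be summarized as a lookup table. The sketch below is purely illustrative: the dictionary and the `workflows_for` function are hypothetical, and assigning the ASV workflow to Amplicon 16S and the OTU workflow to Amplicon ITS is read off the workflow names rather than taken from a documented HUB API.

```python
# Workflow names copied from the list above; the structure itself is an assumption.
WORKFLOWS_BY_DATA_TYPE = {
    "Shotgun Metagenomics": [
        "CHAMP Human Taxonomic & Functional Microbiome Profiling",
        "Kepler Host-Agnostic Taxonomic Microbiome Profiling",
        "Host-Agnostic Functional Profiling",
        "AMR and Virulence Markers",  # module required
    ],
    "Amplicon 16S": ["ASV 16S rRNA Amplicon Profiling"],
    "Amplicon ITS": ["OTU ITS rRNA Amplicon Profiling"],
}


def workflows_for(data_type):
    """Return the workflows listed for a data type, or raise ValueError."""
    try:
        return WORKFLOWS_BY_DATA_TYPE[data_type]
    except KeyError:
        raise ValueError(f"Unknown data type: {data_type!r}") from None


if __name__ == "__main__":
    for name in workflows_for("Shotgun Metagenomics"):
        print(name)
```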
Credit charges are assessed upon successful upload and analysis, based on the data type. The CosmosID-HUB team will assign you the correct number of credits according to your subscription terms.
See the chart below for the number of credits charged for each analysis.
Data Type | Taxa | Function | AMR and Virulence Marker |
---|---|---|---|
WGS Deep | 4 | 4 | 2 |
Amplicon 16S | 1 | N/A | N/A |
Amplicon ITS | 1 | N/A | N/A |
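To budget credits before a large import, the chart above can be turned into a quick estimate. The sketch below rests on two assumptions that the text does not confirm: that the listed credits are charged per sample, and that credits for multiple analyses on the same sample add up. Check your subscription terms before relying on the numbers. The `estimate_credits` function is a hypothetical helper, not part of the HUB.

```python
# Credit values copied from the chart; N/A entries are simply omitted.
CREDITS = {
    "WGS Deep": {"Taxa": 4, "Function": 4, "AMR and Virulence Marker": 2},
    "Amplicon 16S": {"Taxa": 1},
    "Amplicon ITS": {"Taxa": 1},
}


def estimate_credits(data_type, analyses, n_samples):
    """Estimate total credits, ASSUMING per-sample, additive charges."""
    available = CREDITS[data_type]
    per_sample = 0
    for analysis in analyses:
        if analysis not in available:
            raise ValueError(f"{analysis} is N/A for {data_type}")
        per_sample += available[analysis]
    return per_sample * n_samples


if __name__ == "__main__":
    # 10 shotgun samples with taxonomic and functional profiling
    print(estimate_credits("WGS Deep", ["Taxa", "Function"], 10))
    # 24 amplicon 16S samples
    print(estimate_credits("Amplicon 16S", ["Taxa"], 24))
```

Under those assumptions, 10 WGS Deep samples with Taxa and Function would come to (4 + 4) × 10 = 80 credits, and 24 Amplicon 16S samples to 24 credits.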
6. Click "Start Upload" to upload your samples
Once uploaded, you can view the status of your analysis using the status indicator next to the samples in the Cohorts and Metadata Menu.