Functional Host-Agnostic Profiling

Functional Workflow

Understanding the functional potential of a microbial community makes it possible to test hypotheses that link specific molecular or biochemical activities to environmental and health-associated phenotypes. To help scientists explore these hypotheses, we are pleased to introduce the Functional Workflow in Cosmos-Hub, which leverages the Enzyme Commission, MetaCyc Pathways, Pfam, CAZy and GO Terms databases to characterize the functional potential of the microbiome community.

The single-sample view of the Functional Workflow presents a tabular view of every database, along with a stacked bar chart and a donut chart to aid visual inspection of the functional capabilities of the microbiome population.

Clicking an entry in the first column of a functional database's table takes you to that feature's description on the database's website.

Technical Appendix

FUNCTIONAL Workflow:

Initial QC, adapter trimming and preprocessing of metagenomic sequencing reads are done using BBDuk (1). The quality-controlled reads are then subjected to a translated search against a comprehensive and non-redundant protein sequence database, UniRef90. The UniRef90 database, provided by UniProt (2), represents a clustering of all non-redundant protein sequences in UniProt, such that each sequence in a cluster aligns with at least 90% identity and 80% coverage to the longest sequence in the cluster. The mapping of metagenomic reads to gene sequences is weighted by mapping quality, coverage and gene sequence length to estimate community-wide weighted gene family abundances, as described by Franzosa et al. (3). Gene families are then annotated to MetaCyc (4) reactions (metabolic enzymes) to reconstruct and quantify MetaCyc metabolic pathways in the community, again following Franzosa et al. (3). The UniRef90 gene families are also regrouped to Enzyme Commission enzymes, Pfam protein domains, CAZy enzymes and GO Terms to give an exhaustive overview of gene functions in the community. Lastly, to facilitate comparisons across samples with different sequencing depths, the abundance values are normalized using total-sum scaling (TSS) to produce "copies per million" units (analogous to TPMs in RNA-Seq).

References:

1. Bushnell, B. (2021). BBDuk Guide - DOE Joint Genome Institute. Retrieved 1 August 2021, from https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/
2. UniProt: the universal protein knowledgebase. (2016). Nucleic Acids Research, 45(D1), D158-D169. doi: 10.1093/nar/gkw1099
3. Franzosa, E., McIver, L., Rahnavard, G., Thompson, L., Schirmer, M., & Weingart, G. et al. (2018). Species-level functional profiling of metagenomes and metatranscriptomes. Nature Methods, 15(11), 962-968. doi: 10.1038/s41592-018-0176-y
4. Caspi, R., Foerster, H., Fulcher, C., Kaipa, P., Krummenacker, M., & Latendresse, M. et al. (2007). The MetaCyc Database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Research, 36(Database), D623-D631. doi: 10.1093/nar/gkm900
5. Carbon, S., Ireland, A., Mungall, C., Shu, S., Marshall, B., & Lewis, S. (2008). AmiGO: online access to ontology and annotation data. Bioinformatics, 25(2), 288-289. doi: 10.1093/bioinformatics/btn615
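
As an illustration of the last two steps in the Technical Appendix (regrouping gene families into functional categories, then total-sum scaling to copies per million), here is a minimal Python sketch. It is not the Cosmos-Hub implementation: the function names, the gene family and EC identifiers, the "UNGROUPED" bucket, and the choice to credit a gene family's full abundance to every group it maps to are all assumptions made for the example.

```python
from collections import defaultdict


def regroup(gene_abundance, gene_to_groups):
    """Sum gene-family abundances into functional groups (e.g. EC numbers).

    Assumption: a gene family mapped to several groups contributes its full
    abundance to each; unmapped families are collected under "UNGROUPED".
    """
    grouped = defaultdict(float)
    for gene, abundance in gene_abundance.items():
        groups = gene_to_groups.get(gene)
        if not groups:
            grouped["UNGROUPED"] += abundance
            continue
        for group in groups:
            grouped[group] += abundance
    return dict(grouped)


def to_copies_per_million(abundance):
    """Total-sum scaling: divide by the sample total, multiply by one million."""
    total = sum(abundance.values())
    if total == 0:
        # An empty or all-zero sample has no meaningful proportions.
        return {feature: 0.0 for feature in abundance}
    return {feature: value / total * 1_000_000
            for feature, value in abundance.items()}


# Hypothetical gene-family abundances for a single sample.
gene_abundance = {"UniRef90_A": 30.0, "UniRef90_B": 10.0, "UniRef90_C": 60.0}

# Hypothetical gene-family -> EC mapping; UniRef90_C has no annotation.
gene_to_ec = {
    "UniRef90_A": ["1.1.1.1"],
    "UniRef90_B": ["1.1.1.1", "2.7.1.1"],
}

ec_abundance = regroup(gene_abundance, gene_to_ec)
ec_cpm = to_copies_per_million(ec_abundance)

for ec, cpm in sorted(ec_cpm.items()):
    print(f"{ec}\t{cpm:.1f}")
```

Because every sample is rescaled to sum to one million, values for the same feature can be compared across samples sequenced at different depths, which is the purpose of the normalization described above.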
