
AMR/VF Profiling

The Antimicrobial Resistance (AMR) and Virulence Factor (VF) profiling pipeline in CosmosID-HUB uses KEPLER to detect and quantify resistance and virulence genes in sequencing data at high resolution. It pairs a k-mer-based matching algorithm with the ResFinder [1] database for AMR genes and the VFDB [2] database for virulence factors.

Pipeline Overview

The KEPLER-AMR/VF Profiling pipeline combines exact k-mer matching with a hierarchical data structure to profile antimicrobial resistance genes and virulence factors in metagenomic samples. It runs in three stages: database construction, querying, and abundance estimation.

Database Construction:

The pipeline uses curated nucleotide gene sequences from ResFinder [1] that have been split into smaller fragments called n-mers. The n-mers are categorized as:

  • Shared Biomarker Attributes: Homologous or overlapping sequences present across multiple AMR genes.
  • Unique Biomarker Attributes: Unique sequences specific to individual genes.

These biomarkers are organized into a hierarchical, tree-like data structure, with the backbone of the tree representing shared biomarkers between genes and the leaves representing individual genes with their unique biomarkers.
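The shared/unique split described above can be illustrated with a minimal sketch. It is not the KEPLER implementation: the function names, the toy sequences, and the choice of n = 4 are all assumptions made for illustration, and the tree is flattened to two levels (shared n-mers as the backbone, unique n-mers as the leaves).

```python
from collections import defaultdict

def split_into_nmers(sequence, n):
    """Return the set of all length-n substrings of a sequence."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}

def classify_biomarkers(genes, n):
    """Split n-mers into shared (several genes) and unique (one gene)."""
    owners = defaultdict(set)
    for gene_id, sequence in genes.items():
        for nmer in split_into_nmers(sequence, n):
            owners[nmer].add(gene_id)
    # Backbone: n-mers found in more than one gene.
    shared = {nmer: ids for nmer, ids in owners.items() if len(ids) > 1}
    # Leaves: n-mers found in exactly one gene.
    unique = {nmer: next(iter(ids)) for nmer, ids in owners.items() if len(ids) == 1}
    return shared, unique

# Hypothetical toy sequences, not real ResFinder entries.
toy_genes = {
    "geneA": "ACGTACGG",
    "geneB": "ACGTACCC",
}
shared, unique = classify_biomarkers(toy_genes, n=4)
```

Here the two toy genes share their first three 4-mers, while the 4-mers covering their differing ends are unique to one gene each. A production database would use much longer n-mers and a deeper hierarchy.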

Query Phase:

During analysis, sequencing reads from the query sample are split into k-mer sets (a type of n-mer with fixed length). These k-mers are queried against the pre-computed database to find exact matches between query biomarkers and reference biomarkers. This approach avoids the need for genome assembly, making it computationally efficient and suitable for analyzing raw sequencing reads directly.
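As a rough illustration of the query step, the sketch below splits reads into k-mers and looks each one up in a precomputed index by exact match. The index contents, read sequences, function names, and k = 4 are hypothetical; the real database layout is not described here, so a plain dictionary stands in for it.

```python
from collections import Counter

def kmers(read, k):
    """Return every length-k substring of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def query_reads(reads, unique_index, k):
    """Count exact k-mer matches per gene; no assembly or alignment is needed."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            gene = unique_index.get(kmer)  # exact-match lookup only
            if gene is not None:
                hits[gene] += 1
    return hits

# Hypothetical index of gene-specific k-mers and two toy reads.
toy_index = {"TACG": "geneA", "ACGG": "geneA", "TACC": "geneB", "ACCC": "geneB"}
toy_reads = ["GTACGG", "TTTTTT"]
hits = query_reads(toy_reads, toy_index, k=4)
```

The first read contributes two matches to geneA and the second read matches nothing. Each lookup is a constant-time hash probe, which is why this style of query can run directly on raw reads.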

Abundance Estimation:

The algorithm calculates abundance by examining composite k-mer statistics and estimating coverage depth. A weighted biomarker score or abundance score is then assigned based on these calculations, reflecting the presence and prevalence of specific AMR genes or VFs in the sample.
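The exact scoring formula is not given in this guide. The sketch below is therefore only one plausible way to combine k-mer statistics with a depth estimate: it assumes a score equal to the fraction of a gene's k-mers that were observed multiplied by the mean count of those observed k-mers, then normalizes the scores to relative abundance. All names and numbers are hypothetical.

```python
def abundance_scores(kmer_hits, gene_kmers):
    """Assumed score: breadth of k-mer coverage times mean observed depth."""
    scores = {}
    for gene, kmers in gene_kmers.items():
        counts = [kmer_hits.get(kmer, 0) for kmer in kmers]
        observed = [c for c in counts if c > 0]
        if not observed:
            continue  # gene not detected in this sample
        breadth = len(observed) / len(counts)   # fraction of k-mers seen
        depth = sum(observed) / len(observed)   # mean count of seen k-mers
        scores[gene] = breadth * depth
    return scores

def relative_abundance(scores):
    """Normalize scores so they sum to 1 across detected genes."""
    total = sum(scores.values())
    return {gene: s / total for gene, s in scores.items()} if total else {}

# Hypothetical per-gene k-mer lists and per-k-mer hit counts.
toy_gene_kmers = {"geneA": ["TACG", "ACGG"], "geneB": ["TACC", "ACCC"]}
toy_hits = {"TACG": 6, "ACGG": 6, "TACC": 4}
scores = abundance_scores(toy_hits, toy_gene_kmers)
rel = relative_abundance(scores)
```

In this toy case geneA is fully covered at depth 6, while geneB has only half of its k-mers observed at depth 4, so geneB receives a lower score.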

AMR and VF features are reported as individual genes (with reference IDs where applicable) or stratified based on antimicrobial resistance classes.
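Stratifying by resistance class amounts to summing gene-level abundances within each class. The following is a minimal sketch; the gene names, class labels, and the "unclassified" fallback are placeholders rather than values taken from ResFinder or from the pipeline's output.

```python
from collections import defaultdict

def stratify_by_class(gene_abundance, gene_to_class):
    """Sum gene-level abundances within each resistance class."""
    totals = defaultdict(float)
    for gene, value in gene_abundance.items():
        # Genes without a class annotation are grouped under a fallback label.
        totals[gene_to_class.get(gene, "unclassified")] += value
    return dict(totals)

# Hypothetical gene-level results and class annotations.
toy_abundance = {"geneA": 0.5, "geneB": 0.25, "geneC": 0.25}
toy_classes = {"geneA": "class1", "geneB": "class1"}
by_class = stratify_by_class(toy_abundance, toy_classes)
```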

Advantages of the KEPLER AMR/VF Profiler

Efficiency: The k-mer-based approach eliminates the need for genome assembly, reducing computational overhead while maintaining high sensitivity and specificity.

Resolution: Combining shared and unique biomarkers enables fine-grained identification of AMR genes, including distinguishing between closely related genes.

Scalability: The tree-like data structure facilitates rapid querying and classification across large datasets.

Applications

The pipeline is particularly valuable for clinical metagenomics, microbiome research, and antimicrobial resistance surveillance. By integrating AMR/VF data with other microbiome profiling tools in the CosmosID-HUB, researchers can perform comprehensive analyses of microbial communities and their functional attributes.

Patented Technology

The core technology behind this pipeline is protected under several patents (US10108778B2, US20200294628A1, and ES2899879T3).


Methodology for manuscripts

Antimicrobial resistance (AMR) genes and virulence factors (VF) were classified using the ResFinder [1] AMR database, the VFDB [2] VF database, and the KEPLER-AMR/VF Profiler within the CosmosID-HUB [3]. This pipeline uses a pre-computed database of curated nucleotide sequences, split into n-mers and organized into a hierarchical tree-like structure based on shared biomarker attributes (homologous or overlapping sequences shared among AMR genes, forming the tree backbone) and unique biomarker attributes (sequences unique to individual genes, forming the tree leaves). Metagenomic sequencing reads were divided into k-mer sets and queried against this database for exact matches. Abundance was calculated using fine-grained composite k-mer statistics and coverage depth estimation, resulting in a weighted biomarker score (abundance score) that was translated to relative abundance.

References:

[1] Ferrer Florensa, A., Kaas, R. S., Clausen, P. T. L. C., Aytan-Aktug, D., & Aarestrup, F. M. (2022). ResFinder – an open online resource for identification of antimicrobial resistance genes in next-generation sequencing data and prediction of phenotypes from genotypes. Microbial Genomics, 8(1), 000748. https://doi.org/10.1099/mgen.0.000748

[2] Liu, B., Zheng, D., Zhou, S., Chen, L., & Yang, J. (2021). VFDB 2022: A general classification scheme for bacterial virulence factors. Nucleic Acids Research, 50(D1), D912–D917. https://doi.org/10.1093/nar/gkab1107

[3] CosmosID-HUB, www.cosmosidhub.com