A note about running MaAsLin3
Users are responsible for selecting parameters that best suit their study data and research objectives. While we provide basic recommendations, it is crucial to have a thorough understanding of the MaAsLin3 tool and the statistical methods employed to ensure accurate and reproducible results.
We strongly encourage users to familiarize themselves with the tool’s capabilities and limitations to make informed decisions regarding their analysis settings.
Please note that (Beta) indicates the status of the MaAsLin3 pipeline, which is currently in its beta stage following the publication of its preprint article.
Before running MaAsLin3, you will need to process your .fastq files through one of the taxonomic or functional workflows available within the Cosmos-Hub. You will also need to add metadata to each of your samples to define cohorts needed for comparative analysis. Ensure your metadata encompasses all variables that you want to measure and that may affect your results, with clear groupings for main comparative variables.
After profiling, select the samples that you would like to compare in the “Cohorts and Metadata” menu and click “Create CA” to initiate the configuration of your parameters.
Select one of the following workflows depending on your input data:
This section outlines suggested settings for MaAsLin3 to optimize differential abundance analysis. Proper configuration ensures accurate results and helps address specific research questions effectively. Use these guidelines to select appropriate fixed and random effects, normalization techniques, and statistical thresholds tailored to your study design. Adjusting these parameters based on your dataset’s characteristics and research objectives will enhance the reliability and interpretability of your findings.
| Parameter | Kepler-Taxa | HostAgnostic Functional | CHAMP-Taxa | CHAMP-Functional |
|---|---|---|---|---|
| Metric | Relative Abundance | CPM | Relative Abundance | Cellular Abundance |
| Taxonomic Level | Species | NA | Species | NA |
| Fixed Effect | Metadata of Interest | Metadata of Interest | Metadata of Interest | Metadata of Interest |
| Random Effect | Optional Variables | Optional Variables | Optional Variables | Optional Variables |
| Longitudinal and Complex Association Formula (Optional) | See page | See page | See page | See page |
| Min. Abundance* | 0 | 1 | 0 | 0.1 |
| Min. Prevalence** | 0 | 0.10 | 0 | 0.25 |
| Zero Threshold | 0 | 0 | 0 | 0 |
| Minimum Variance | 0 | 0 | 0 | 0 |
| Q Threshold | 0.25 | 0.25 | 0.25 | 0.01 |
| Normalization | TSS | TSS | TSS | NONE |
| Transformation | LOG | LOG | LOG | LOG |
| Multiple Correction | BH | BH | BH | BH |
| Standardization | | | | |
| Augmentation | | | | |
| Median Comparison Abundance Compositional Correction | | | | |
| Median Comparison Abundance Threshold | 0 | 0 | 0 | 0 |
| Median Comparison Prevalence Compositional Correction | 0 | 0 | 0 | 0 |
| Subtract Median | FALSE | FALSE | FALSE | FALSE |
| Maximum Number of Associations to Plot | 50 | 50 | 50 | 50 |
| Number of Features in Summary Plot | 25 | 25 | 25 | 25 |
*A single hit can always be the result of sequencing error, contamination, or other sources of noise. Applying a minimum abundance cutoff together with a minimum prevalence cutoff helps ensure more accurate multivariate association results.
**Samples are far richer in functional hits than in taxonomic hits. Non-prevalent functions can sharply increase computational time and may include false positives, which is why more stringent cutoffs are recommended.
Building the MaAsLin3 model depends on fixed and random effects that can be input as metadata variables (added using the “+ADD” button). Fixed effects are required, while random effects are optional (but recommended when relevant metadata are available).
These variables define how samples are grouped into comparative cohorts, directly impacting the results. Fixed effects are the primary variables of interest, such as treatment groups or time points, while random effects account for potential confounding factors, like number of reads, age, or grouping variables. Properly defining these effects ensures that your analysis accurately reflects the biological questions you are exploring and minimizes biases in the model.
Fixed effects are metadata variables that relate to your hypothesis. These are the primary variables that you are interested in obtaining specific comparative results against. A fixed effect must have 2 or more groups for comparison (we recommend a max of 5 groups).
For example, a fixed effect could be:
Multiple fixed effects can be selected for comparisons across multiple metadata variables. Results will be presented in the analysis results for every variable/level.
Selecting Effect Variable Parameters (Data Type + Reference Value)
When defining fixed and random effects, it’s essential to select appropriate data types and reference values.
Fixed effects should include categorical variables with clear groups or continuous variables. For categorical data, specify a reference value (if applicable) to serve as a baseline for comparison (e.g., day0, control, healthy). Should a categorical data type be selected without a reference value, MaAsLin3 by default sets the first category in alphabetical order as the reference.
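For intuition, here is a minimal sketch (in Python, with a hypothetical `disease_state` variable) of how reference-level coding works in general; it illustrates the concept of treatment coding against a reference, not MaAsLin3’s internal implementation:

```python
import pandas as pd

# Hypothetical categorical metadata column (illustration only).
meta = pd.DataFrame({"disease_state": ["healthy", "IBD", "IBD", "healthy", "UC"]})

# With "healthy" listed first as the reference level, every other level
# gets its own indicator column, and model coefficients for those columns
# are interpreted relative to "healthy".
coded = pd.get_dummies(
    pd.Categorical(meta["disease_state"], categories=["healthy", "IBD", "UC"]),
    drop_first=True,  # drop the reference level's column
)
print(coded)
```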
Random effects account for variability due to confounding factors (e.g., subject ID in longitudinal studies, age, cage number). Correctly setting these parameters ensures robust statistical modeling and accurate interpretation of results.
Another feature of MaAsLin3 is the ability to test for level-versus-level differences using ordered predictors and contrast tests. Ordered predictors are categorical variables with a natural ordering such as cancer stage, consumption frequency of a dietary factor, or dosage group.
Say you want to test all fixed effects groups against one another, rather than to the reference value. For example, you want to test Drug Dosage across all levels [25ng vs. 50ng, 50ng vs. 100ng, 25ng vs. 100ng] instead of against baseline [25ng vs. 50ng, 25ng vs. 100ng – this is the default workflow when selecting the reference value].
To do so, you will have to incorporate “ordered(DrugDosage)” into the Longitudinal and Complex Association Formula, and you can read how to do so here. This will perform a contrast test for whether there is a difference between each pair of subsequent levels. The coefficient, standard error, and p-value all correspond to the difference between the listed level and the previous level. Ordered predictors should only be included as fixed effects (i.e., not as random effects).
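As a conceptual sketch of what these successive-level contrasts estimate, the example below (Python, with made-up dosage data) computes the difference between each level and the level before it; the actual MaAsLin3 test also reports standard errors and p-values:

```python
import numpy as np

# Hypothetical log-abundances for three ordered dosage levels (illustration only).
levels = ["25ng", "50ng", "100ng"]
y = {"25ng": np.array([1.0, 1.2, 0.9]),
     "50ng": np.array([1.8, 2.1, 1.9]),
     "100ng": np.array([2.5, 2.4, 2.7])}

# With an ordered predictor, each reported coefficient corresponds to the
# difference between a level and the level immediately before it.
means = [y[level].mean() for level in levels]
contrasts = {f"{levels[i]} vs. {levels[i - 1]}": means[i] - means[i - 1]
             for i in range(1, len(levels))}
print(contrasts)  # {'50ng vs. 25ng': ..., '100ng vs. 50ng': ...}
```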
Implementing random effects with fewer than 5 observations per group may produce poor model fits and consequently no significant associations.
Random effects are metadata variables that you may not be interested in evaluating directly but that may introduce variability into the study data. Use these to account for confounding variables that might introduce noise, such as subject-specific differences (e.g., BMI, age). This helps minimize bias from uncontrolled factors. Only include relevant confounders to prevent model overfitting.
For example, a random effect could be:
Adding Number of Reads as a Random Effect
Because MaAsLin3 identifies prevalence (presence/absence) associations, sample read depth (number of reads) should be included. Deeper sequencing will likely increase feature detection in a way that could spuriously correlate with the metadata of interest when read depth is not included in the model.
Advanced modeling with Longitudinal and Complex Association Formula
You can also utilize the optional Longitudinal and Complex Association Formula to incorporate advanced metadata groupings, such as:
To learn more, consult the Longitudinal and Complex Associations with MaAsLin3 documentation.
Options include: Phylum, Class, Order, Family, Genus, Species (recommended), Strain
Select the desired phylogenetic level of calls for differential abundance analysis.
Recommended: Species
Options for taxonomic data:
Options for functional data:
Indicates the version of the analysis pipeline used to generate the data. If only one version of the pipeline has been run, that option will be selected by default. We recommend running the analysis with the most up-to-date version of your pipeline.
See dedicated documentation page linked above.
Features with abundances of at least the Minimum Abundance value in at least the Minimum Prevalence proportion of samples will be included in the analysis. The threshold is applied before normalization and transformation.
Set thresholds to filter low-abundance features, reducing noise and focusing on biologically relevant data.
Default: 0.001
The minimum percent of samples for which a feature is detected at minimum abundance. Also see “Minimum Abundance” description above.
This parameter helps filter out rare features that may not provide meaningful insights by defining the minimum proportion of samples in which a feature must be present to be included in the analysis. Selecting an appropriate threshold reduces noise and focuses on features that are consistently observed across the study cohort. Typically, a threshold of 5% for taxonomic analysis and 25% for functional analysis is recommended, but this can vary based on the study design and specific research questions.
Prevalence and abundance filtering in MaAsLin3
Typically, it only makes sense to test for feature-metadata associations if a feature is non-zero “enough” of the time. “Enough” can vary between studies, but a 10-50% minimum prevalence threshold is not unusual (and up to 70-90% can be reasonable). Selecting a minimum prevalence filter of 5% will test only features with at least 5% non-zero values.
Similarly, it’s often desirable to test only features that reach a minimum abundance threshold in at least the minimum prevalence proportion of samples. By default, MaAsLin3 will consider any non-zero value to be reliable, and if you’ve already done sufficient QC on your dataset, this is appropriate. However, if you’d like to filter more stringently, you can set a minimum abundance threshold like min_abundance = 0.001 to test only features reaching at least this (relative) abundance.
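The sketch below (Python, with a made-up relative-abundance table) illustrates how a minimum abundance and minimum prevalence filter act together; it mirrors the filtering concept described above rather than MaAsLin3’s exact implementation:

```python
import numpy as np

# Hypothetical relative-abundance table: features in rows, samples in columns.
abundance = np.array([
    [0.20, 0.15,   0.00, 0.30],   # common, abundant feature
    [0.00, 0.0005, 0.00, 0.00],   # rare, low-abundance feature
    [0.01, 0.02,   0.03, 0.00],   # moderately prevalent feature
])
min_abundance = 0.001   # a feature must reach this abundance...
min_prevalence = 0.5    # ...in at least this fraction of samples (illustrative value)

# Fraction of samples in which each feature reaches the minimum abundance.
prevalence = (abundance >= min_abundance).mean(axis=1)
keep = prevalence >= min_prevalence
filtered = abundance[keep]   # only the first and third features survive
```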
Default: 0
Abundances less than or equal to zero_threshold will be treated as zeros. This is primarily to be used when the abundance table has likely low-abundance false positives.
Default: 0
Features with abundance variances less than or equal to Minimum Variance will be dropped. This is primarily used for dropping features that are entirely zero.
The maximum q-value for an association to be considered significant.
The Q value threshold controls the false discovery rate, helping to reduce false positives in multiple comparisons. A typical threshold is ≤ 0.1, but this can vary depending on the dataset (0.05-0.25). Adjust based on your study’s balance between identifying true associations and minimizing errors.
Different normalization techniques adjust for varying sequencing depths or compositional biases:
The transformation to apply to the features after normalization and before analysis. The option LOG is recommended, but PLOG (pseudo-log with a pseudo-count of half the dataset minimum non-zero abundance replacing zeros, particularly for metabolomics data) and NONE can also be used.
Choose transformations depending on the data distribution:
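For illustration, the following sketch (Python, with hypothetical counts) shows TSS normalization followed by a PLOG-style pseudo-log transform using the zero-replacement convention described above; it is an assumption-laden illustration, not MaAsLin3’s internal code:

```python
import numpy as np

# Hypothetical count table: features in rows, samples in columns.
counts = np.array([
    [120.0,  80.0,   0.0],
    [ 30.0,   0.0,  10.0],
    [850.0, 920.0, 990.0],
])

# TSS: divide each sample (column) by its total so abundances sum to 1.
tss = counts / counts.sum(axis=0, keepdims=True)

# PLOG-style pseudo-log: zeros are replaced with half of the smallest
# non-zero abundance in the table before taking the log.
pseudo = tss[tss > 0].min() / 2.0
plog = np.log2(np.where(tss > 0, tss, pseudo))
```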
Correction for multiple testing is a statistical method used to reduce the likelihood of false positive results when performing multiple comparisons or tests. This produces a q value (or False Discovery Rate, FDR), which complements the corresponding p value. When many statistical tests are conducted simultaneously, the chance of finding a significant result purely by chance increases. To address this, correction methods adjust the significance levels to control the overall error rate.
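As a minimal sketch of how Benjamini-Hochberg (BH) q-values are derived from p-values and compared against the chosen Q threshold (hypothetical p-values; illustration only, not MaAsLin3’s internal code):

```python
import numpy as np

# Hypothetical raw p-values from several feature-metadata tests.
pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.27, 0.60])
m = len(pvals)

# BH adjustment: rank p-values, scale by m / rank, then enforce monotonicity
# from the largest p-value downward.
order = np.argsort(pvals)
ranked = pvals[order] * m / np.arange(1, m + 1)
qvals_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
qvals = np.empty(m)
qvals[order] = np.clip(qvals_sorted, 0, 1)

# Compare against the chosen Q threshold (0.25 in the recommended settings).
significant = qvals <= 0.25
```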
Options include:
Apply z-score standardization so continuous metadata are on the same scale.
Standardizing a numeric or continuous metadata variable involves transforming the values to have a mean of zero and a standard deviation of one. This process, also known as z-score normalization, allows variables with different units or scales to be compared directly. Standardization is crucial when variables differ significantly in their ranges or units, ensuring that each variable contributes equally to the model and preventing any one variable from disproportionately influencing the analysis. It is commonly used in linear models and clustering algorithms.
Why standardize your numeric variables?
Suppose you have a microbiome dataset where you want to analyze the impact of participants’ age and BMI on microbial abundance. Since age and BMI have different scales, standardizing these variables (e.g., converting age from years and BMI from kg/m² to z-scores) ensures that both contribute equally to the analysis. This prevents the model from being disproportionately influenced by the variable with the larger numerical range, allowing for more balanced and interpretable results.
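A minimal sketch of z-score standardization (Python, with hypothetical age and BMI values):

```python
import numpy as np

# Hypothetical continuous metadata (illustration only).
age = np.array([24.0, 31.0, 45.0, 52.0, 60.0])   # years
bmi = np.array([19.5, 22.0, 27.3, 31.8, 24.6])   # kg/m^2

def zscore(x):
    # Transform to mean 0 and standard deviation 1 so variables with
    # different units contribute comparably to the model.
    return (x - x.mean()) / x.std()

age_z, bmi_z = zscore(age), zscore(bmi)
```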
To avoid linear separability in the logistic regression, at each input data point, add an extra 0 and an extra 1 observation weighted as the number of predictors divided by two times the number of data points. This is almost always recommended to avoid linear separability while having a minor effect on fit coefficients otherwise.
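The sketch below illustrates this augmentation scheme with hypothetical data (Python; a simplified illustration of the idea, not MaAsLin3’s actual implementation):

```python
import numpy as np

# Hypothetical prevalence (presence/absence) data: n samples, p predictors.
rng = np.random.default_rng(0)
n_samples, n_predictors = 40, 4
X = rng.normal(size=(n_samples, n_predictors))      # predictor matrix
presence = rng.binomial(1, 0.6, size=n_samples)     # 0/1 outcomes
weights = np.ones(n_samples)

# For every original data point, append the same predictor row twice:
# once with outcome 0 and once with outcome 1, each weighted by
# p / (2 * n). This prevents linear separability in the logistic fit
# while having only a minor effect on the estimated coefficients.
aug_weight = n_predictors / (2 * n_samples)
X_aug = np.vstack([X, X, X])
y_aug = np.concatenate([presence, np.zeros(n_samples), np.ones(n_samples)])
w_aug = np.concatenate([weights, np.full(2 * n_samples, aug_weight)])
```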
When “Median Comparison Abundance Compositional Correction” or “Median Comparison Prevalence Compositional Correction” is checked on, the coefficients for a metadatum will be tested against the median coefficient for that metadatum (median across the features). Otherwise, the coefficients will be tested against 0.
For abundance associations, this is designed to account for compositionality: the idea that if only one feature has a positive association with a metadatum on the absolute scale (cell count), the other features will have apparent negative associations with that metadatum on the relative scale (proportion of the community) because relative abundances must sum to 1.
More generally, associations on the relative scale are not necessarily the same as the associations on the absolute scale in magnitude or sign, so testing against zero on the relative scale is not equivalent to testing against zero on the absolute scale. When testing associations on the relative scale, the coefficients should be tested against 0 (median comparison off). However, since these tests do not correspond to tests for associations on the absolute scale, we instead use a test against the median, which can enable some inference on the absolute scale.
There are two interpretations of this test for absolute abundance associations:
By contrast, sparsity should be less affected by compositionality, since a feature should still be present even if another feature increases or decreases in abundance. (Note that, because the read depth is finite, this might not always be true in practice.) Therefore, Median Comparison Prevalence Compositional Correction is off by default, but it can be turned on if the user is interested in testing whether a particular prevalence association is significantly different from the typical prevalence association.
In both cases, if the tested coefficient is within Median Comparison Abundance/Prevalence Threshold of the median, it will automatically receive a p-value of 1. This is based on the idea that the association might be statistically significantly different but not substantially different from the median and therefore is likely still a result of compositionality.
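A simplified sketch of the median comparison idea (Python, with hypothetical coefficients; the real test also accounts for coefficient standard errors):

```python
import numpy as np

# Hypothetical per-feature coefficients for one metadatum (illustration only).
coefs = np.array([2.1, -0.3, -0.4, -0.2, -0.5])
median_coef = np.median(coefs)          # the "typical" association
threshold = 0.25                        # Median Comparison Abundance Threshold

# With median comparison on, each coefficient is tested against the median
# coefficient rather than against 0.
effect_vs_median = coefs - median_coef

# Coefficients within the threshold of the median are treated as likely
# compositional artifacts and would automatically receive a p-value of 1;
# here only the strongly positive feature remains testable.
auto_p_one = np.abs(effect_vs_median) <= threshold
print(auto_p_one)
```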
To conclude:
Number of significant associations to plot in the volcano plots
Recommended: 50
Number of features to plot in the summary plot/heatmap
Recommended: 25