Alpha diversity is all about the variety within a single sample. Think: how many different species are there (richness)? And how evenly are those species spread out (evenness)? There are a few ways to measure this, depending on what you’re looking for.
Typically, we recommend reporting results for all calculated metrics (e.g., “CHAO1: p=0.002, Shannon: p=0.042, Simpson: p=0.067”), as they may show different patterns depending on the underlying community changes.
Diversity can be measured with several metrics. The Cosmos-Hub features the ones most common in microbiome research:
Chao1 Index (Chao, A. (1987). “Estimating the population size for capture-recapture data with unequal catchability.” Biometrics, 43(4), 783–791.)
Shannon-Weaver Index (Shannon, C. E. (1948). “A mathematical theory of communication.” Bell System Technical Journal, 27, 379–423 & 623–656.)
Simpson Index (Simpson, E. H. (1949). “Measurement of diversity.” Nature, 163(4148), 688.)
❓“I want to know how many species are really there (even the rare ones).”
➡ Use: CHAO1 Index
- Best when your sample has lots of low-abundance organisms (like in microbiome studies).
- Estimates the total number of species by using the pattern of rare species detection (singletons and doubletons).
- Useful when you suspect that sequencing depth limitations prevented detection of some species.
- Based on capture-recapture principles, so it doesn’t assume any particular statistical distribution for your data.
Takeaway: CHAO1 is your go-to when you’re trying to estimate the true number of species, including the ones that are too rare to be detected reliably in your sample.
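To make the singleton/doubleton idea concrete, here is a minimal sketch of the bias-corrected Chao1 estimator, S_obs + F1(F1 − 1) / (2(F2 + 1)). The function name `chao1` and the read counts are made up for illustration, and this is not necessarily the exact variant the Cosmos-Hub computes.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                      # taxa actually detected
    f1 = sum(1 for c in observed if c == 1)    # singletons (seen once)
    f2 = sum(1 for c in observed if c == 2)    # doubletons (seen twice)
    # Many singletons relative to doubletons suggest many undetected taxa.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical sample: 10 detected taxa, 4 singletons, 2 doubletons.
sample = [120, 45, 30, 8, 2, 2, 1, 1, 1, 1, 0]
print(chao1(sample))  # → 12.0
```

Here the estimate (12) exceeds the 10 taxa observed because four taxa were seen only once, which hints that a few more were missed entirely.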
❓“I care about both how many species there are and how evenly they’re spread.”
➡ Use: Shannon Index
- A balanced metric that considers both richness (how many) and evenness (how equal).
- Measures the “uncertainty” in predicting species identity when randomly selecting an individual.
- Sensitive to both common and rare species in the community.
- Higher values indicate greater diversity through either more species or more even distribution.
Takeaway: Shannon Index gives you a comprehensive picture of community complexity. If your sample has 10 species but one makes up 90% of the total, the Shannon score will reflect that imbalance and come out much lower than for a sample where the same 10 species are evenly distributed.
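The 10-species example can be checked with a short sketch of the Shannon formula, H = −Σ p_i ln(p_i). The natural logarithm is an assumption here (some tools use log base 2, which rescales the values), and the counts are made up for illustration.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


even = [10] * 10            # 10 species, each 10% of the sample
skewed = [810] + [10] * 9   # 10 species, one makes up 90%

print(f"{shannon(even):.2f}")    # → 2.30
print(f"{shannon(skewed):.2f}")  # → 0.54
```

Both samples have the same richness, so the drop from about 2.30 to about 0.54 comes entirely from the loss of evenness.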
❓“I’m mostly interested in the dominant players.”
➡ Use: Simpson Index
- Focuses heavily on the most abundant species in the community.
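- Measures the probability that two randomly selected individuals belong to different species (the 1 − D form, often called Gini-Simpson; D itself is the probability that they belong to the same species).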
- Less sensitive to rare or low-abundance species compared to Shannon.
- Excellent for detecting changes in community dominance structure.
Takeaway: Simpson Index is great when you’re interested in who’s really running the show in your community and how dominance patterns change between samples.
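Because "Simpson Index" can refer to several related quantities, the sketch below computes the dominance D = Σ p_i² together with the 1 − D form. It uses the simple proportion-based formula rather than a finite-sample correction, the function name `simpson` is hypothetical, and which form the Cosmos-Hub reports is not assumed here.

```python
def simpson(counts):
    """Return (D, 1 - D), where D = sum(p_i^2) is Simpson's dominance."""
    total = sum(counts)
    dominance = sum((c / total) ** 2 for c in counts if c > 0)
    return dominance, 1 - dominance


even = [10] * 10            # 10 species, each 10% of the sample
skewed = [810] + [10] * 9   # 10 species, one makes up 90%

d_even, div_even = simpson(even)
d_skew, div_skew = simpson(skewed)
print(f"{div_even:.2f}")  # → 0.90
print(f"{div_skew:.2f}")  # → 0.19
```

The squared proportions mean the dominant species contributes almost all of D in the skewed sample, while the nine rare species barely move the value.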
🧭 So…which one should I use?
Best practice is to calculate and report all three metrics, as they capture different aspects of diversity. However, if you need to focus on one metric for your analysis:
| Goal | Best Metric | Why |
|---|---|---|
| Estimate total species richness | CHAO1 | Accounts for undetected rare species |
| Comprehensive diversity assessment | Shannon Index | Balances richness and evenness |
| Focus on dominance patterns | Simpson Index | Emphasizes abundant species |
| Compare across studies | Use a combination | Different metrics reveal different patterns |
📈 What About Statistical Testing?
Alpha diversity metrics can be statistically compared between groups using non-parametric tests:
Wilcoxon Rank-Sum Test (Mann-Whitney U): For comparing two groups
- Tests whether the distributions of diversity values differ significantly between groups
- Appropriate when data may not be normally distributed
Kruskal-Wallis Test: For comparing more than two groups
- Non-parametric equivalent of one-way ANOVA
- Follow with post-hoc tests if significant differences are found
Important Considerations:
- Always visualize your data first (boxplots, violin plots)
- Check for outliers that might influence results
- Consider multiple testing corrections when comparing many groups
- Report effect sizes alongside p-values when possible
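As one way to report an effect size next to a Wilcoxon rank-sum p-value, the sketch below computes the Mann-Whitney U statistic by direct pair counting and converts it to a rank-biserial correlation, r = 2U / (n1 · n2) − 1. The Shannon values and group names are made up for illustration. For the p-value itself, a statistics library (for example `scipy.stats.mannwhitneyu`, or `scipy.stats.kruskal` for more than two groups) would normally be used.

```python
def mann_whitney_u(x, y):
    """U for group x: number of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def rank_biserial(x, y):
    """Effect size in [-1, 1]; 0 means no tendency for either group to rank higher."""
    return 2 * mann_whitney_u(x, y) / (len(x) * len(y)) - 1


# Hypothetical Shannon values for two groups of five samples each.
control = [3.1, 3.4, 2.9, 3.6, 3.2]
treated = [2.5, 2.8, 3.0, 2.4, 2.7]

print(mann_whitney_u(control, treated))          # → 24.0
print(f"{rank_biserial(control, treated):.2f}")  # → 0.92
```

A value near +1 means control samples almost always have higher diversity than treated samples, which is easier to interpret than a p-value alone.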