# Alpha Diversity Metrics: How to Choose the Right One

Alpha diversity is all about the variety **within** a single sample. Think: how many different species are there (richness)? And how evenly are those species spread out (evenness)? There are a few ways to measure this, depending on what you're looking for.

<Tip>
  Typically, **we recommend reporting results for all calculated metrics** (e.g., "CHAO1: p=0.002, Shannon: p=0.042, Simpson: p=0.067"), because the metrics may show different patterns depending on the underlying community changes.
</Tip>

Diversity can be measured with several metrics. The Cosmos-Hub features the three most common in microbiome research:

**Chao1 Index** (Chao, A. (1987). “Estimating the population size for capture-recapture data with unequal catchability.” Biometrics 43(4): 783–791.)

**Shannon-Weaver Index** (Shannon, C. E. (1948). “A mathematical theory of communication.” Bell System Technical Journal, 27, 379–423 & 623–656.)

**Simpson Index** (Simpson, E. H. (1949). “Measurement of diversity.” Nature, 163(4148), 688.)

## **❓"I want to know how many species are really there (even the rare ones)."**

➡ **Use: CHAO1 Index**

* Best when your sample has lots of low-abundance organisms (like in microbiome studies).
* Estimates the total number of species by using the pattern of rare species detection (singletons and doubletons).
* Useful when you suspect that sequencing depth limitations prevented detection of some species.
* Based on capture-recapture principles; it doesn't assume any particular statistical distribution for your data.

**Takeaway:** CHAO1 is your go-to when you're trying to estimate the **true** number of species, including the ones that are too rare to be detected reliably in your sample.
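To make the role of singletons and doubletons concrete, here is a minimal Python sketch of the Chao1 calculation. It assumes raw integer read counts per species for one sample (relative abundances have no singletons to count). The function name `chao1` and the example counts are made up for illustration. This page does not state whether the Cosmos-Hub uses the classic or the bias-corrected form, so the sketch exposes both.

```python
def chao1(counts, bias_corrected=True):
    """Estimate total species richness from per-species read counts."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                    # species actually detected
    f1 = sum(1 for c in observed if c == 1)  # singletons: seen exactly once
    f2 = sum(1 for c in observed if c == 2)  # doubletons: seen exactly twice

    # The classic form divides by f2, so it is undefined without doubletons;
    # the bias-corrected form is the usual fallback in that case.
    if bias_corrected or f2 == 0:
        return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
    return s_obs + f1 ** 2 / (2 * f2)


# Hypothetical sample: 9 detected species, 4 singletons, 2 doubletons.
example_counts = [120, 45, 9, 2, 2, 1, 1, 1, 1, 0]
print(chao1(example_counts))                        # bias-corrected estimate
print(chao1(example_counts, bias_corrected=False))  # classic estimate
```

In this toy sample the estimate exceeds the 9 observed species because the four singletons suggest that further rare species went undetected.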

## **❓"I care about both how many species there are and how evenly they're spread."**

➡ **Use: Shannon Index**

* A balanced metric that considers both richness (how many) and evenness (how equal).
* Measures the "uncertainty" in predicting species identity when randomly selecting an individual.
* Sensitive to both common and rare species in the community.
* Higher values indicate greater diversity through either more species or more even distribution.

**Takeaway:** Shannon Index gives you a comprehensive picture of community complexity. If your sample has 10 species but one makes up 90% of the total, the Shannon score will reflect that imbalance and come out much lower than it would for 10 evenly distributed species.
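The 10-species example can be checked numerically. The following is a minimal sketch, assuming the natural-log form H = −Σ pᵢ ln pᵢ (some tools use log base 2, which changes the scale but not the ordering of samples). The helper name `shannon_index` and the counts are illustrative only.

```python
import math


def shannon_index(counts):
    """Shannon index using natural log; zero counts are ignored."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical communities, both with 10 species.
even_community = [10] * 10            # every species at 10%
skewed_community = [810] + [10] * 9   # one species at 90%, nine share 10%

print(shannon_index(even_community))    # equals ln(10), the maximum for 10 species
print(shannon_index(skewed_community))  # far lower, despite identical richness
```

Both communities have the same richness, so the gap between the two values comes entirely from evenness.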

## **❓"I'm mostly interested in the dominant players."**

➡ **Use: Simpson Index**

* Focuses heavily on the most abundant species in the community.
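* Measures the probability that two randomly selected individuals belong to different species (this is the Gini-Simpson form, 1 − D; Simpson's original D is the probability that they belong to the same species).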
* Less sensitive to rare or low-abundance species compared to Shannon.
* Excellent for detecting changes in community dominance structure.

**Takeaway:** Simpson Index is great when you're interested in who's **really** running the show in your community and how dominance patterns change between samples.
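A short sketch shows why the Simpson Index tracks dominance. It assumes the Gini-Simpson form, 1 − Σ pᵢ², which matches the "two individuals from different species" reading; which form a given tool reports is worth confirming before comparing numbers. The function names and counts are hypothetical.

```python
def simpson_dominance(counts):
    """Simpson's D: probability two random individuals share a species."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts if c > 0)


def gini_simpson(counts):
    """1 - D: probability two random individuals differ in species."""
    return 1 - simpson_dominance(counts)


# Hypothetical communities, both with 10 species.
even_community = [10] * 10            # every species at 10%
skewed_community = [810] + [10] * 9   # one species at 90%

print(gini_simpson(even_community))    # high: no species dominates
print(gini_simpson(skewed_community))  # low: driven almost entirely by the 90% species
```

Because each proportion is squared, the dominant species contributes 0.81 to D in the skewed community, while the nine rare species together contribute only about 0.001.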

## **🧭 So…which one should I use?**

Best practice is to calculate and report all three metrics, as they capture different aspects of diversity. If you need to focus on a single metric for your analysis, choose by goal:

| **Goal**                           | **Best Metric** | **Why**                                     |
| :--------------------------------- | :-------------- | :------------------------------------------ |
| Estimate total species richness    | CHAO1           | Accounts for undetected rare species        |
| Comprehensive diversity assessment | Shannon Index   | Balances richness and evenness              |
| Focus on dominance patterns        | Simpson Index   | Emphasizes abundant species                 |
| Compare across studies             | Use combination | Different metrics reveal different patterns |

## **📈 What About Statistical Testing?**

Alpha diversity metrics can be statistically compared between groups using **non-parametric tests**:

**Wilcoxon Rank-Sum Test (Mann-Whitney U):** For comparing two groups

* Tests whether the distributions of diversity values differ significantly between groups
* Appropriate when data may not be normally distributed

**Kruskal-Wallis Test:** For comparing more than two groups

* Non-parametric equivalent of one-way ANOVA
* Follow with post-hoc tests if significant differences are found

**Important Considerations:**

* Always visualize your data first (boxplots, violin plots)
* Check for outliers that might influence results
* Consider multiple testing corrections when comparing many groups
* Report effect sizes alongside p-values when possible
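One effect size that pairs naturally with the Wilcoxon rank-sum test is the rank-biserial correlation, which ranges from −1 to 1 and is derived from the same U statistic. The sketch below computes both in plain Python by comparing every pair of values across two groups. The function name and the Shannon values are hypothetical, and in practice the p-value itself would come from a statistics library.

```python
def mann_whitney_effect(group_a, group_b):
    """Return (U for group_a, rank-biserial correlation) for two groups."""
    wins = losses = ties = 0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1
            elif a < b:
                losses += 1
            else:
                ties += 1

    n_pairs = len(group_a) * len(group_b)
    u_a = wins + 0.5 * ties                   # ties count as half a win
    rank_biserial = (wins - losses) / n_pairs  # same as 2 * U / n_pairs - 1
    return u_a, rank_biserial


# Hypothetical Shannon values for two groups of samples.
control = [3.1, 2.9, 3.4, 3.0]
treated = [2.2, 2.8, 2.5, 3.0]

u_stat, effect = mann_whitney_effect(control, treated)
print(u_stat, effect)
```

A value near 1 means almost every control sample is more diverse than every treated sample, a value near 0 means the groups overlap heavily, and a value near −1 means the reverse ordering.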
