Microglia Cell proportions

Published

March 18, 2026

Code
# BiocManager::install("speckle")
# https://bioconductor.org/packages/devel/bioc/vignettes/speckle/inst/doc/speckle.html
library(qs2)
library(Seurat)
library(tidyverse) 
library(speckle)
library(kableExtra)

immune <- UpdateSeuratObject(qs_read("seurat_objects/20260225-mg_kai.qs2"))
Idents(immune) <- "sub.cluster"
# Order samples by Treatment.Group so groups are adjacent
immune$sample_ID <- factor(
  immune$sample_ID,
  levels = c("KK4_465", "KK4_504", "KK4_496", "KK4_492", "KK4_502", "KK4_464"))
Sample ID Treatment Group
KK4_465 IgG
KK4_504 IgG
KK4_496 IgG
KK4_492 Adu
KK4_502 Adu
KK4_464 Adu
Code
p <- plotCellTypeProps(immune,
                  clusters = Idents(immune),
                  sample = immune$sample_ID)

# Enforce sample_ID factor order on x-axis
sample_order <- c("KK4_465", "KK4_504", "KK4_496", "KK4_492", "KK4_502", "KK4_464")
p + scale_x_discrete(limits = sample_order) +
  theme_minimal() +
  theme(axis.title = element_text(size = 14),
        axis.text.x = element_text(size = 13, angle = 30, hjust = 1),
        axis.text.y = element_text(size = 13),
        legend.text = element_text(size = 14))

The function propeller from the speckl package tests whether cell-type proportions differ between experimental conditions, while properly accounting for biological replication and sample-to-sample variability.

As a result, propeller identifies true compositional changes in cell populations and avoids false positives driven by uneven cell capture or outlier samples.

Transform In propeller, transform refers to converting raw cell-type proportions into a scale where statistical testing is valid. Proportions are bounded between 0 and 1 and have unequal variance, so transforming them stabilizes variance and allows linear models to be used appropriately.

Logit The logit transform maps proportions from [0,1] to \((-\infty, +\infty)\) using

\(\log\left(\frac{p}{1-p}\right)\)

This reduces heteroskedasticity, improves power to detect differences in cell-type abundance, and makes effects interpretable as differences in log-odds, while small offsets are used to handle zeros. ***

Robust In propeller, robust refers to using robust empirical Bayes variance estimation in the linear model. This down-weights the influence of outlier samples with extreme cell-type proportions, preventing them from artificially inflating significance.

In practice, robust = TRUE makes the test more resistant to technical or biological outliers while preserving group means, leading to more reliable inference on cell-type composition differences.

Test for differences in immune cell type proportions between treatment groups in total brain

Code
# The propeller function can take a SingleCellExperiment object or Seurat object as input 
# and extract the three necessary  pieces of information from the cell information stored in colData.
#  The three essential pieces of information are

# cluster (Idents function by default)
# sample
# group

prop <- propeller(
  x = immune,
  clusters = Idents(immune),
  sample = immune$sample_ID,
  group = immune$Treatment.Group,
  trend = FALSE,
  robust = TRUE,
  transform = "logit")

prop |> as_tibble () |> 
  kbl(digits = 3) |> 
  kable_styling("striped")
BaselineProp.clusters BaselineProp.Freq PropMean.Adu PropMean.IgG PropRatio Tstatistic P.Value FDR
Metabolic 0.203 0.179 0.229 0.783 -2.291 0.051 0.564
Il1b+ 0.014 0.017 0.012 1.387 1.385 0.205 0.923
ARM 0.054 0.063 0.045 1.390 1.240 0.252 0.923
DAM_1 0.068 0.071 0.065 1.093 0.725 0.489 0.949
Hm1 0.094 0.089 0.099 0.908 -0.658 0.529 0.949
DAM_0 0.247 0.253 0.236 1.073 0.648 0.536 0.949
IRM 0.017 0.018 0.016 1.153 0.541 0.604 0.949
Hm2 0.131 0.133 0.130 1.028 0.259 0.803 0.953
Spp1+ 0.080 0.085 0.079 1.086 0.227 0.827 0.953
Proliferating 0.006 0.006 0.006 0.974 -0.167 0.872 0.953
Gpnmb+ 0.084 0.086 0.085 1.008 0.061 0.953 0.953

Interpretation

Column Brief explanation
BaselineProp.clusters Cell type or cluster being tested for proportional differences between conditions.
BaselineProp.Freq Overall mean proportion of the cluster across all samples, independent of condition.
PropMean.Adu Mean sample-level proportion of the cluster in the Adu group.
PropMean.IgG Mean sample-level proportion of the cluster in the IgG group.
PropRatio Ratio of mean proportions between groups (Adu / IgG); values >1 indicate enrichment in Adu, <1 depletion.
Tstatistic Moderated t-statistic testing whether transformed proportions differ between conditions; sign indicates direction.
P.Value Raw p-value for the difference in proportions for that cluster.
FDR False discovery rate–adjusted p-value accounting for testing across all clusters.

Immune cell type proportions by brain region

Code
md <- immune@meta.data |> 
  group_by(Idents(immune),Region) |> 
  summarise(n.cells = n()) |> 
  pivot_wider(names_from = Region, values_from = n.cells)

kbl(md, caption = "number of cells per cluster per region") |> 
  kable_styling("striped")
number of cells per cluster per region
Idents(immune) Cortex Hippocampus Hypothalamus Thalamus WM NA
DAM_0 1874 769 222 768 414 1637
Metabolic 1992 276 325 766 91 1223
DAM_1 534 161 104 373 78 323
Gpnmb+ 317 176 216 853 139 240
Hm2 410 313 544 710 533 508
Hm1 525 576 207 247 57 548
Spp1+ 400 199 259 598 130 264
Proliferating 41 21 10 32 12 23
ARM 384 153 64 129 45 467
IRM 143 41 22 83 34 75
Il1b+ 125 26 6 23 12 133
Code
# Wrap propeller so that regions where the model fails return NULL instead of an error
safe_propeller <- possibly(propeller, otherwise = NULL)

# Get all annotated regions, dropping NA entries
regions <- unique(immune$Region) |> na.omit() |> as.character()

# Run propeller separately for each brain region
prop_by_region <- map(regions, \(reg) {
  # Subset to cells from this region only
  sub <- subset(immune, Region == reg)

  # Skip if either treatment group has < 2 samples (propeller requires replication)
  n_per_group <- table(
    distinct(sub@meta.data, sample_ID, Treatment.Group)$Treatment.Group
  )
  if (any(n_per_group < 2)) return(NULL)

  # Test for proportion differences between Adu and IgG within this region
  safe_propeller(
    x         = sub,
    clusters  = Idents(sub),
    sample    = sub$sample_ID,
    group     = sub$Treatment.Group,
    trend     = FALSE,
    robust    = TRUE,       # down-weights outlier samples
    transform = "logit"     # variance-stabilising transform for bounded proportions
  )
}) |>
  # Name list elements by region, then drop NULLs (skipped/failed regions)
  set_names(regions) |>
  compact()

# Combine per-region results into a single data frame with a Region column
prop_region_df <- imap(prop_by_region, \(tbl, reg) {
  as_tibble(tbl) |> mutate(Region = reg)
}) |>
  list_rbind()
Code
prop_heatmap <- prop_region_df |>
  mutate(
    # Convert PropRatio (Adu/IgG) to log2 scale; set Inf/NaN (zero in one group) to NA
    log2ratio = case_when(
      is.nan(log2(PropRatio)) | is.infinite(log2(PropRatio)) ~ NA_real_,
      TRUE ~ log2(PropRatio)
    ),
    # Overlay significance stars on coloured tiles, and × on grey (zero-proportion) tiles
    stars = case_when(
      is.na(log2ratio)   ~ "\u00d7",   # cross for 0-proportion tiles
      P.Value < 0.001    ~ "***",
      P.Value < 0.01     ~ "**",
      P.Value < 0.05     ~ "*",
      TRUE               ~ ""
    )
  )

# Order cell types by mean log2ratio across regions (Adu-enriched at top)
celltype_order <- prop_heatmap |>
  group_by(BaselineProp.clusters) |>
  summarise(mean_r = mean(log2ratio, na.rm = TRUE)) |>
  arrange(desc(mean_r)) |>
  pull(BaselineProp.clusters)

prop_heatmap <- prop_heatmap |>
  mutate(
    CellType  = factor(BaselineProp.clusters, levels = rev(celltype_order)),
    Region    = factor(Region),
    # Use a darker grey for × marks so they remain visible on the grey background
    cross_col = ifelse(is.na(log2ratio), "grey30", "grey15")
  )

# Symmetric colour scale limit (rounded up to nearest integer)
clim <- ceiling(max(abs(prop_heatmap$log2ratio), na.rm = TRUE))

ggplot(prop_heatmap, aes(x = Region, y = CellType, fill = log2ratio)) +
  # Tile fill encodes direction and magnitude of the proportion shift
  geom_tile(colour = "white", linewidth = 0.5) +
  coord_equal() +
  # Significance stars (or ×) overlaid on each tile
  geom_text(aes(label = stars, colour = cross_col), size = 10, vjust = 0.75) +
  # Pass colour strings directly without mapping to a scale
  scale_colour_identity() +
  # Diverging palette: red = Adu-enriched, blue = IgG-enriched, grey = undefined
  scale_fill_distiller(
    palette   = "RdBu",
    direction = -1,
    limits    = c(-clim, clim),
    na.value  = "grey88",
    name      = "log2(Adu/IgG)"
  ) +
  labs(
    title    = "Cell type proportions by brain region",
    subtitle = "Fill: log2(PropRatio Adu/IgG);  * p<0.05  ** p<0.01",
    x = NULL, y = NULL
  ) +
  theme_minimal(base_size = 13) +
  theme(
    axis.text.x     = element_text(angle = 40, hjust = 1, size = 18),
    axis.text.y     = element_text(size = 18),
    panel.grid      = element_blank(),
    legend.position = "right",
    plot.title      = element_text(face = "bold", size = 18),
    plot.subtitle   = element_text(size = 16, colour = "grey40")
  )

Code
iwalk(prop_by_region, \(tbl, reg) {
  cat("\n\n<details><summary>**", reg, "**</summary>\n\n")
  tbl |>
    as_tibble() |>
    arrange(FDR) |>
    kbl(digits = 3, caption = paste("Region:", reg)) |>
    kable_styling("striped", full_width = FALSE) |>
    print()
  cat("\n\n</details>\n\n")
})
** Thalamus **
Region: Thalamus
BaselineProp.clusters BaselineProp.Freq PropMean.Adu PropMean.IgG PropRatio Tstatistic P.Value FDR
Metabolic 0.167 0.149 0.197 0.753 -1.395 0.211 0.939
Spp1+ 0.131 0.144 0.117 1.231 0.975 0.366 0.939
DAM_1 0.081 0.090 0.077 1.174 0.894 0.405 0.939
Hm2 0.155 0.143 0.159 0.900 -0.600 0.570 0.939
Gpnmb+ 0.186 0.188 0.176 1.067 0.445 0.672 0.939
Hm1 0.054 0.054 0.057 0.943 -0.429 0.683 0.939
Proliferating 0.007 0.007 0.007 0.930 -0.406 0.699 0.939
ARM 0.028 0.031 0.023 1.376 0.237 0.821 0.939
Il1b+ 0.005 0.005 0.006 0.864 -0.202 0.847 0.939
IRM 0.018 0.021 0.015 1.466 0.192 0.854 0.939
DAM_0 0.168 0.168 0.166 1.010 0.063 0.952 0.952
** Hypothalamus **
Region: Hypothalamus
BaselineProp.clusters BaselineProp.Freq PropMean.Adu PropMean.IgG PropRatio Tstatistic P.Value FDR
Metabolic 0.164 0.113 0.231 0.486 -2.224 0.053 0.585
Spp1+ 0.131 0.141 0.095 1.487 1.279 0.235 0.635
Il1b+ 0.003 0.004 0.002 1.911 1.195 0.265 0.635
Proliferating 0.005 0.004 0.007 0.550 -1.179 0.271 0.635
Gpnmb+ 0.109 0.120 0.087 1.380 1.068 0.315 0.635
Hm1 0.105 0.089 0.135 0.659 -0.976 0.355 0.635
Hm2 0.275 0.309 0.251 1.231 0.801 0.444 0.635
IRM 0.011 0.012 0.007 1.584 0.771 0.462 0.635
DAM_1 0.053 0.053 0.049 1.081 0.361 0.727 0.888
ARM 0.032 0.039 0.025 1.569 0.137 0.894 0.984
DAM_0 0.112 0.116 0.110 1.054 0.004 0.997 0.997
** Hippocampus **
Region: Hippocampus
BaselineProp.clusters BaselineProp.Freq PropMean.Adu PropMean.IgG PropRatio Tstatistic P.Value FDR
IRM 0.015 0.022 0.006 3.855 3.956 0.006 0.064
Il1b+ 0.010 0.013 0.005 2.850 2.320 0.056 0.307
Metabolic 0.102 0.086 0.125 0.688 -1.740 0.126 0.461
DAM_1 0.059 0.067 0.050 1.357 1.350 0.219 0.555
Hm2 0.115 0.102 0.135 0.756 -1.248 0.252 0.555
Hm1 0.212 0.190 0.242 0.784 -0.871 0.414 0.670
DAM_0 0.284 0.304 0.260 1.168 0.846 0.426 0.670
Gpnmb+ 0.065 0.068 0.056 1.215 0.492 0.639 0.878
Proliferating 0.008 0.008 0.007 1.028 -0.269 0.796 0.941
Spp1+ 0.073 0.081 0.064 1.264 -0.112 0.914 0.941
ARM 0.056 0.059 0.051 1.168 0.077 0.941 0.941
** Cortex **
Region: Cortex
BaselineProp.clusters BaselineProp.Freq PropMean.Adu PropMean.IgG PropRatio Tstatistic P.Value FDR
ARM 0.057 0.073 0.042 1.734 2.255 0.043 0.473
Gpnmb+ 0.047 0.038 0.051 0.746 -1.137 0.277 0.956
Metabolic 0.295 0.278 0.330 0.842 -0.914 0.378 0.956
DAM_1 0.079 0.081 0.072 1.138 0.643 0.532 0.956
DAM_0 0.278 0.288 0.261 1.103 0.554 0.589 0.956
IRM 0.021 0.020 0.019 1.056 0.526 0.609 0.956
Proliferating 0.006 0.006 0.005 1.183 0.437 0.670 0.956
Spp1+ 0.059 0.055 0.058 0.947 -0.202 0.843 0.956
Il1b+ 0.019 0.021 0.018 1.142 0.178 0.862 0.956
Hm1 0.078 0.079 0.086 0.922 -0.128 0.901 0.956
Hm2 0.061 0.060 0.058 1.044 0.056 0.956 0.956
** WM **
Region: WM
BaselineProp.clusters BaselineProp.Freq PropMean.Adu PropMean.IgG PropRatio Tstatistic P.Value FDR
Metabolic 0.059 0.029 0.096 0.299 -3.325 0.003 0.032
IRM 0.022 0.028 0.015 1.916 1.615 0.120 0.658
ARM 0.029 0.024 0.054 0.441 -1.371 0.185 0.679
Gpnmb+ 0.090 0.089 0.056 1.596 1.095 0.285 0.730
Proliferating 0.008 0.006 0.011 0.580 -0.898 0.382 0.730
DAM_1 0.050 0.042 0.057 0.738 -0.861 0.398 0.730
Il1b+ 0.008 0.008 0.008 0.931 -0.640 0.529 0.831
Hm2 0.345 0.377 0.338 1.116 0.487 0.631 0.867
Spp1+ 0.084 0.104 0.092 1.137 0.246 0.808 0.892
DAM_0 0.268 0.252 0.236 1.066 0.138 0.892 0.892
Hm1 0.037 0.041 0.037 1.097 0.137 0.892 0.892