Spatial co-expression modules

Module detection, plaque-type association, treatment effects, and module characterization

Published

March 27, 2026

Background

Genes that are individually spatially autocorrelated often co-vary across space in coherent groups, reflecting shared regulatory context: cell-type co-localisation, shared niche signals, or co-expressed gene programs. Detecting such groups requires computing gene–gene spatial correlations that account for the physical layout of cells in tissue, rather than simple Pearson or Spearman correlations across cells.

We implemented this analysis using the Giotto Suite package [@dries2021giotto], which provides a dedicated spatial co-expression workflow built on three functions:

detectSpatialCorFeats() computes spatially smoothed gene–gene correlations by averaging expression over each cell’s neighbourhood on the Delaunay triangulation network, then computing pairwise correlations on those smoothed profiles — a process that weights co-expression by spatial proximity.
clusterSpatialCorFeats() groups genes into co-expression modules by hierarchical clustering on the resulting gene–gene correlation matrix.
createMetafeats() is then intended to summarise each module as a single per-cell metagene score (the mean normalised expression of the module’s member genes), which can be related back to spatial coordinates, cell-type annotations, and plaque distances.

This notebook addresses three questions:

Plaque-type association — Are spatial programs associated specifically with CAA (vascular amyloid), parenchymal plaques, or both?
Treatment effect — Do Aducanumab-treated brains show different module–plaque associations compared with IgG controls?
Module characterisation — What genes define each module, and which cell types drive it?

Giotto issues

Giotto’s spatial co-expression workflow is partially broken across its v1 → v2 → GiottoClass/v3 migration. The code in this notebook deviates from the documented API in several places. This section records every workaround so that future debugging is not done from scratch.

Function	Documented behaviour	Actual behaviour	Resolution
`detectSpatialCorFeats`	Returns updated giotto object with spatCorObject stored inside	Returns a bare `spatCorObject`; assigning back to the giotto variable destroys it	Store return in a separate variable (`spat_cor1`); keep giotto object untouched
`clusterSpatialCorFeats`	Called on the giotto object	Must be called on the `spatCorObject` directly	Pass `spat_cor1`, not `giotto1`
`createMetafeats` — `spat_cor_matrix` arg	`spat_cor_matrix = spat_cor1` passes the spatCorObject	Parameter does not exist → unused argument error	Removed; cluster vector passed directly
`createMetafeats` — `feat_clusters` arg	`feat_clusters = "spat_cor_feats"` (string name)	String lookup fails; requires the actual named integer vector	Extract manually: `spat_cor1$cor_clusters$spat_cor_feats`
`getSpatialCorFeats`	Retrieves spatCorObject from within the giotto object	Fails — spatCorObject is never stored inside the giotto object	Abandoned; `spat_cor1` saved to disk separately and loaded directly
`getExpression`	`getExpression(giotto1, ...)`	`future::getExpression` masks `Giotto::getExpression` when `future` is loaded, dispatching to the wrong method	Use explicit namespace: `Giotto::getExpression()`
Zero-variance gene filter	Not required by docs	`stats::cor()` returns NA for zero-variance genes → `hclust` crashes with NA/NaN/Inf in foreign function call	Filter genes with `apply(expr, 1, var) > 0` before calling `detectSpatialCorFeats`
`createMetafeats` — persistence	Metagene scores stored in expression or metadata slots of the giotto object	After saving and reloading the giotto object, neither `showGiottoExpression()` nor `getCellMetadata()` shows the scores — `feat_type = "spatial_modules"` does not exist	Recompute manually: `Matrix::colMeans` over normalised expression matrix for each module’s gene set
`getExpression` — `feat_type = "spatial_modules"`	`values = "normalized"` retrieves stored metagene scores	Fails: Normalized expression not found. First run normalization step — even `values = "raw"` fails because the feat_type was never written	Abandoned; metagene scores recomputed directly from expression matrix
`cor_DT$variable` type	Character gene names	Stored as factor or integer; direct string joins fail silently	Coerce before joining: `mutate(variable = as.character(variable))`
`as_tibble()` on transposed metagene matrix	Numeric column names preserved	R prefixes purely-numeric column names with `"V"` (e.g. module `1` becomes `"V1"`), breaking integer joins downstream	Strip prefix when needed: `as.integer(sub("^V", "", module))`

5 of 7 core functions or operations required undocumented workarounds. The spatCorObject internal structure ($cor_clusters$<name>) was reverse-engineered from str() output. The root cause is an incomplete API migration across Giotto versions with no deprecation warnings or updated vignettes.

Setup

Code

source("R/setup.R")
library(patchwork)
library(ggrepel)
library(Giotto)

data               <- load_slides_and_distances("full")
slide1             <- data$slide1
slide2             <- data$slide2
cell_distances_all <- data$distances_all

giotto1     <- qs2::qs_read("spatial/giotto1_modules.qs2")
giotto2     <- qs2::qs_read("spatial/giotto2_modules.qs2")
spat_cor1   <- qs2::qs_read("spatial/spat_cor1.qs2")
spat_cor2   <- qs2::qs_read("spatial/spat_cor2.qs2")
mod_unified <- read_csv("spatial/mod_unified.csv", show_col_types = FALSE)
cor_DT_avg  <- read_csv("spatial/cor_DT_avg.csv",  show_col_types = FALSE)

cor_any   <- qs2::qs_read("spatial/cor_any.qs2")
cor_caa   <- read_csv("spatial/cor_caa.csv",   show_col_types = FALSE)
cor_paren <- read_csv("spatial/cor_paren.csv", show_col_types = FALSE)

Code

## Module detection ##################################################################
# Run this chunk once to compute spatial correlations, cluster genes into modules, add metagene scores to the Giotto objects, and save all outputs to disk. All subsequent sections load from those saved files. See the Giotto API notes above for why each workaround is needed.
#######################################################################################

source("R/setup.R")
library(patchwork)
library(ggrepel)

# ── Slide 1 ──────────────────────────────────────────────────────────────────
giotto1     <- qs2::qs_read("spatial/giotto1.qs2")
expr1       <- Giotto::getExpression(giotto1, values = "normalized", output = "matrix")
# Zero-variance genes produce NA correlations → hclust crashes; remove them first.
keep_feats1 <- rownames(expr1)[apply(expr1, 1, var) > 0]

spat_cor1 <- detectSpatialCorFeats(
  giotto1,
  method               = "network",
  spatial_network_name = "delaunay",
  expression_values    = "normalized",
  subset_feats         = keep_feats1
)

# clusterSpatialCorFeats takes the spatCorObject, not the giotto object.
spat_cor1 <- clusterSpatialCorFeats(spat_cor1, name = "spat_cor_feats", k = 6)

# feat_clusters must be the actual named integer vector, not a string name.
giotto1 <- createMetafeats(giotto1,
                            feat_clusters = spat_cor1$cor_clusters$spat_cor_feats,
                            name = "spatial_modules")

# ── Slide 2 ──────────────────────────────────────────────────────────────────
giotto2     <- qs2::qs_read("spatial/giotto2.qs2")
expr2       <- Giotto::getExpression(giotto2, values = "normalized", output = "matrix")
keep_feats2 <- rownames(expr2)[apply(expr2, 1, var) > 0]

spat_cor2 <- detectSpatialCorFeats(
  giotto2,
  method               = "network",
  spatial_network_name = "delaunay",
  expression_values    = "normalized",
  subset_feats         = keep_feats2
)

spat_cor2 <- clusterSpatialCorFeats(spat_cor2, name = "spat_cor_feats", k = 6)

giotto2 <- createMetafeats(giotto2,
                            feat_clusters = spat_cor2$cor_clusters$spat_cor_feats,
                            name = "spatial_modules")

# ── Unified modules (average spatial correlations across both slides) ─────────
# Running detection independently per slide produces non-comparable module labels.
# We average the pairwise spatial correlation tables, recluster, and derive a
# single gene→module mapping used by all downstream sections.
cor_DT_avg <- full_join(
  as_tibble(spat_cor1$cor_DT) |> mutate(variable = as.character(variable)),
  as_tibble(spat_cor2$cor_DT) |> mutate(variable = as.character(variable)),
  by      = c("feat_ID", "variable"),
  suffix  = c("_1", "_2")
) |>
  mutate(spat_cor = rowMeans(cbind(spat_cor_1, spat_cor_2), na.rm = TRUE)) |>
  select(feat_ID, variable, spat_cor)

# Rebuild a symmetric gene×gene matrix and cluster with hclust (ward.D2).
all_genes   <- union(cor_DT_avg$feat_ID, cor_DT_avg$variable)
avg_mat     <- matrix(0, nrow = length(all_genes), ncol = length(all_genes),
                      dimnames = list(all_genes, all_genes))
diag(avg_mat) <- 1
for (i in seq_len(nrow(cor_DT_avg))) {
  g1 <- cor_DT_avg$feat_ID[i]; g2 <- cor_DT_avg$variable[i]
  v  <- cor_DT_avg$spat_cor[i]
  avg_mat[g1, g2] <- v; avg_mat[g2, g1] <- v
}
hc              <- hclust(as.dist(1 - avg_mat), method = "ward.D2")
unified_clusters <- cutree(hc, k = 6)

mod_unified <- tibble(
  gene   = names(unified_clusters),
  module = as.integer(unified_clusters)
)

# ── Save to disk ──────────────────────────────────────────────────────────────
qs2::qs_save(giotto1, "spatial/giotto1_modules.qs2")
qs2::qs_save(giotto2, "spatial/giotto2_modules.qs2")
# spatCorObjects are not stored inside giotto objects in this Giotto version;
# save them separately so downstream sections can load them directly.
qs2::qs_save(spat_cor1, "spatial/spat_cor1.qs2")
qs2::qs_save(spat_cor2, "spatial/spat_cor2.qs2")
write_csv(mod_unified, "spatial/mod_unified.csv")
write_csv(cor_DT_avg,  "spatial/cor_DT_avg.csv")

Module overview

Spatial correlations were computed per slide using each slide’s Delaunay triangulation network, then the pairwise gene–gene correlation tables were averaged across both slides to produce a single consensus correlation matrix. This pooling step ensures that module identities reflect spatial co-expression structure common to both tissue sections rather than being biased by the spatial layout of one slide.

Modules were then defined by Ward’s minimum-variance hierarchical clustering on this averaged matrix. Hierarchical clustering operates on a dissimilarity matrix derived from the correlations (distance = 1 − rho), and at each step merges the pair of clusters whose combination minimises the total within-cluster variance (Ward’s D2 criterion). The algorithm produces a full dendrogram — a nested tree of gene groupings ordered from the most tightly correlated pairs at the bottom to the full gene set at the top. Cutting the dendrogram at k = 6 partitions all genes into six groups such that genes within the same group have high average pairwise spatial correlation with each other, and low correlation with genes in other groups. The choice of k = 6 was made to yield biologically interpretable modules without over-fragmenting the panel; a larger k would split existing modules into sub-programs, while a smaller k would merge biologically distinct programs. The bar chart below shows how many genes are assigned to each module.

Code

mod_unified |>
  count(module) |>
  ggplot(aes(factor(module), n)) +
  geom_col(fill = "#1f77b4") +
  labs(x = "Module", y = "Number of genes",
       title = "Genes per unified spatial co-expression module") +
  theme_minimal(base_size = 11)

Hierarchical clustering on the slide-averaged spatial correlation matrix partitioned the 474-gene Xenium panel into six modules ranging in size from 41 to 155 genes (Modules 1 and 6 being the most compact, Module 2 the largest). Module sizes reflect the intrinsic co-expression topology of the tissue rather than an artefact of a single sample, since the correlation structure was averaged across both slides before clustering. The tight modules (1 and 6) contain genes with high pairwise spatial correlations (mean within-module rho > 0.45), indicative of strong spatial co-localisation, whereas the larger Module 2 has markedly lower within-module correlations (mean rho ≈ 0.05–0.07), suggesting a more diffuse or heterogeneous spatial pattern. Gene set identity of each module is discussed in the Module characterisation section below.

Plaque-type association

To determine whether spatial programs are specifically associated with CAA, parenchymal plaques, or both, we compute Spearman gene-distance correlations for dist_any, dist_caa, and dist_paren (all pre-computed once and loaded from disk above). The sign is flipped so rho_flip > 0 means enriched near that plaque type. We then overlay each gene’s module membership with its rho_flip for all three distance types.

Code

# This calculation is different from notebook 6's outputs
#The min_expressing_cells = 5 threshold is much lower
# this is intentional here because the co-expression
# module analysis needs rho values for as many genes as possible (including lowly expressed ones)
# to populate the module scatter plots. Notebook 6 uses 100 to keep only reliably expressed genes 
# for the ranked correlation analysis.            
cor_caa   <- compute_gene_distance_correlation(
  list(slide1, slide2), cell_distances_all, distance_col = "dist_caa",
  min_expressing_cells = 5)
cor_paren <- compute_gene_distance_correlation(
  list(slide1, slide2), cell_distances_all, distance_col = "dist_paren",
  min_expressing_cells = 5)

write_csv(cor_caa,   "spatial/cor_caa.csv")
write_csv(cor_paren, "spatial/cor_paren.csv")

Code

# Combine all three distance types into a single long table for faceting.
plaque_type_df <- bind_rows(
  mod_unified |> left_join(cor_any   |> select(gene, rho_flip), by = "gene") |> mutate(plaque = "Any"),
  mod_unified |> left_join(cor_caa   |> select(gene, rho_flip), by = "gene") |> mutate(plaque = "CAA"),
  mod_unified |> left_join(cor_paren |> select(gene, rho_flip), by = "gene") |> mutate(plaque = "Parenchymal")
) |>
  filter(!is.na(rho_flip)) |>
  mutate(plaque = factor(plaque, levels = c("Any", "CAA", "Parenchymal")))

ggplot(plaque_type_df,
       aes(x = factor(module), y = rho_flip, fill = plaque)) +
  geom_boxplot(outlier.size = 0.5, alpha = 0.75,
               position = position_dodge(0.8), width = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") +
  scale_fill_manual(values = c(Any = "grey60", CAA = "#d62728",
                                Parenchymal = "#1f77b4")) +
  labs(
    title = "Plaque-type association by spatial co-expression module",
    subtitle = "rho_flip > 0: enriched near that plaque type",
    x = "Spatial module", y = "Spearman rho (flipped)", fill = "Plaque type"
  ) +
  theme_minimal(base_size = 12)

Code

plaque_type_df |>
  group_by(module, plaque) |>
  summarise(
    n_genes      = n(),
    median_rho   = median(rho_flip, na.rm = TRUE),
    pct_enriched = mean(rho_flip > 0, na.rm = TRUE) * 100,
    .groups = "drop"
  ) |>
  arrange(plaque, module) |>
  kbl(digits = 3,
      col.names = c("Module", "Plaque type", "N genes",
                    "Median rho_flip", "% enriched near")) |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) |>
  collapse_rows(columns = 2, valign = "top")

Module	Plaque type	N genes	Median rho_flip	% enriched near
1	Any	41	0.021	68.293
2		146	-0.013	34.247
3		105	-0.074	1.905
4		53	-0.027	20.755
5		63	0.022	80.952
6		51	0.119	100.000
1	CAA	41	-0.052	9.756
2		155	-0.002	45.161
3		105	0.010	68.571
4		53	-0.007	35.849
5		63	0.026	92.063
6		51	-0.009	17.647
1	Parenchymal	41	0.031	68.293
2		155	-0.010	36.774
3		105	-0.060	4.762
4		53	-0.034	16.981
5		63	-0.008	39.683
6		51	0.107	98.039

The plaque-type association analysis reveals a clear functional dissociation between the two amyloid lesion subtypes.

Module 6, defined by canonical microglial genes (Cx3cr1, Trem2, C1qa/c, P2ry12, Hexb). displays the strongest spatial enrichment of any module near plaques (median rho_flip = 0.119 for any plaque), and this response is almost entirely attributable to parenchymal amyloid (rho_paren = 0.107) rather than CAA (rho_CAA = −0.009). Conversely, Module 5, characterized by vascular and extracellular matrix genes (Flt1, Mgp, Vtn, Col1a2, Timp3) together with the interferon-stimulated transmembrane proteins Ifitm2 and Ifitm3, shows a modest but consistent selectivity for CAA (rho_CAA = 0.026, rho_paren = −0.008). This vascular/ECM module likely reflects reactive changes in the mural cell and perivascular fibroblast compartment adjacent to amyloid-laden vessels, distinct from the parenchymal microglial response.

Modules 2, 3, and 4, representing neuronal and astrocytic programs, are all mildly depleted near plaques (rho_any range: −0.013 to −0.074), consistent with local loss of neuronal processes and synaptic terminals in the periplaque environment. Notably, Module 3 (synaptic/glutamatergic genes: Grin2a/b, Snap25, metabolic enzymes) shows the strongest depletion (rho_any = −0.074, rho_paren = −0.060), pointing specifically to loss of excitatory synaptic machinery near parenchymal plaques. Module 4, the astrocytic module (Gja1, Aqp4, Aldh1l1, Clu), is also modestly depleted (rho_paren = −0.034), suggesting that the canonical homeostatic astrocyte program, rather than reactive astrogliosis, is reduced in plaque-proximal tissue; the reactive astrocyte signature would require additional markers (e.g., Gfap, Vim) not fully captured by this module. Module 1 (oligodendrocyte/myelin: Mog, Mag, Aspa, Pllp) occupies an intermediate position, slightly enriched near parenchymal plaques (rho_paren = 0.031) but depleted near CAA (rho_CAA = −0.052), which may reflect myelin damage and oligodendrocyte vulnerability in the white-matter-adjacent parenchymal amyloid niche.

Treatment effect (Adu vs IgG)

To test whether Aducanumab reshapes the spatial transcriptional response to plaques, we repeat the gene-distance correlation after stratifying cells by treatment group. The cell_distances_all table retains Treatment.Group from the Seurat metadata. We compute cor_any separately for Adu and IgG cells, then compare rho_flip per module between treatments. A module with higher rho_flip in Adu than IgG is more strongly recruited near plaques after treatment; lower rho_flip in Adu suggests treatment attenuates the plaque-proximal program.

Code

dist_adu <- cell_distances_all |> filter(Treatment.Group == "Adu")
dist_igg <- cell_distances_all |> filter(Treatment.Group == "IgG")

# Filter seurat objects to matching cells before passing to the correlation function.
slide1_adu <- subset(slide1, cells = intersect(colnames(slide1), dist_adu$cell))
slide2_adu <- subset(slide2, cells = intersect(colnames(slide2), dist_adu$cell))
slide1_igg <- subset(slide1, cells = intersect(colnames(slide1), dist_igg$cell))
slide2_igg <- subset(slide2, cells = intersect(colnames(slide2), dist_igg$cell))

cor_adu <- compute_gene_distance_correlation(
  list(slide1_adu, slide2_adu), dist_adu, distance_col = "dist_any")
cor_igg <- compute_gene_distance_correlation(
  list(slide1_igg, slide2_igg), dist_igg, distance_col = "dist_any")

Code

treatment_df <- bind_rows(
  mod_unified |> left_join(cor_adu |> select(gene, rho_flip), by = "gene") |> mutate(treatment = "Adu"),
  mod_unified |> left_join(cor_igg |> select(gene, rho_flip), by = "gene") |> mutate(treatment = "IgG")) |>
  filter(!is.na(rho_flip)) |>
  mutate(treatment = factor(treatment, levels = c("IgG", "Adu")))

ggplot(treatment_df,
       aes(x = factor(module), y = rho_flip, fill = treatment)) +
  geom_boxplot(outlier.size = 0.5, alpha = 0.75,
               position = position_dodge(0.8), width = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") +
  scale_fill_manual(values = c(IgG = "#7f7f7f", Adu = "#2ca02c")) +
  labs(
    title = "Module–plaque association by treatment",
    subtitle = "rho_flip: Spearman correlation with nearness to any plaque, split by treatment",
    x = "Spatial module", y = "Spearman rho (flipped)", fill = "Treatment") +
  theme_minimal(base_size = 11)

Code

# Delta rho per module: Adu − IgG. Positive = stronger plaque recruitment after treatment.
delta_df <- inner_join(
  cor_adu |> select(gene, rho_adu = rho_flip),
  cor_igg |> select(gene, rho_igg = rho_flip),
  by = "gene"
) |>
  mutate(delta_rho = rho_adu - rho_igg) |>
  inner_join(mod_unified |> select(gene, module), by = "gene")

delta_df |>
  group_by(module) |>
  summarise(
    n_genes    = n(),
    median_delta = median(delta_rho),
    .groups = "drop") |>
  kbl(digits = 3,
      col.names = c("Module", "N genes", "Median Δrho (Adu − IgG)")) |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE)

Module	N genes	Median Δrho (Adu − IgG)
1	41	0.008
2	146	0.006
3	105	-0.006
4	53	0.001
5	63	0.011
6	51	0.015

Code

ggplot(delta_df, aes(x = factor(module), y = delta_rho, fill = factor(module))) +
  geom_boxplot(outlier.size = 0.5, alpha = 0.75) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") +
  labs(
    title = "Treatment effect on module–plaque association (Adu − IgG)",
    subtitle = "Δrho > 0: module more strongly recruited near plaques in Adu-treated tissue",
    x = "Spatial module", y = "Δrho (Adu − IgG)") +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")

Aducanumab treatment did not substantially alter the spatial transcriptional organisation of any module relative to plaques.

Median Δrho (Adu − IgG) across all six modules ranged from −0.002 to +0.008, well below any threshold of biological relevance; the largest positive shift was seen in Module 6 (Δrho = +0.008, microglia/DAM), suggesting a marginal trend toward stronger microglial recruitment near residual plaques in treated animals, but this effect is negligible at the current sample size (n = 3 per group).

Module characterization

Top genes per module

For each module we list the genes with the highest within-module spatial correlation (i.e. the genes that are most tightly co-expressed with the rest of the module). These are the genes that most faithfully represent the spatial program.

Code

# Within-module gene ranking: for each gene, its mean spatial correlation with
# all other genes in the same module (from the averaged pairwise cor_DT table).
top_genes_per_module <- cor_DT_avg |>
  # Keep only within-module pairs
  inner_join(mod_unified |> rename(feat_ID = gene, mod_A = module), by = "feat_ID") |>
  inner_join(mod_unified |> rename(variable = gene, mod_B = module), by = "variable") |>
  filter(mod_A == mod_B, feat_ID != variable) |>
  group_by(module = mod_A, gene = feat_ID) |>
  summarise(mean_spat_cor = mean(spat_cor, na.rm = TRUE), .groups = "drop") |>
  arrange(module, desc(mean_spat_cor)) |>
  slice_max(mean_spat_cor, n = 10, by = module)

top_genes_per_module |>
  kbl(digits = 3,
      col.names = c("Module", "Gene", "Mean within-module spatial cor")) |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) |>
  collapse_rows(columns = 1, valign = "top")

Module	Gene	Mean within-module spatial cor
1	Mog	0.484
	Cldn11	0.483
	Ermn	0.482
	Aspa	0.471
	Mag	0.470
	Gatm	0.456
	Apod	0.449
	Pllp	0.446
	Lap3	0.397
	Cers2	0.397
2	Calb1	0.070
	Dgkd	0.057
	Lamp5	0.055
	Cux2	0.052
	Lypd1	0.050
	Comt	0.050
	Jund	0.049
	Egr1	0.049
	Akt1	0.048
	Bhlhe40	0.048
3	Opcml	0.413
	Grin2b	0.412
	Got1	0.406
	Snap25	0.398
	Mdh1	0.396
	Pkm	0.388
	Grin2a	0.386
	Chst1	0.383
	Got2	0.381
	Pgam1	0.379
4	Gja1	0.352
	Aqp4	0.351
	Pla2g7	0.346
	Mt2	0.322
	Clu	0.319
	Prodh	0.286
	Cbs	0.281
	Plpp3	0.280
	Aldh1l1	0.277
	Gstm1	0.277
5	Ifitm3	0.327
	Ifitm2	0.294
	Vtn	0.281
	Bgn	0.279
	Vim	0.271
	Flt1	0.270
	Timp3	0.265
	Col1a2	0.261
	Clec2d	0.255
	Mgp	0.252
6	Cx3cr1	0.527
	C1qa	0.522
	C1qc	0.520
	Trem2	0.509
	Hexb	0.501
	Gpr34	0.500
	Bcl2a1b	0.492
	Ly86	0.491
	P2ry12	0.488
	Cd68	0.474

The top-gene lists permit biological annotation of all six modules:

Module 1 is an oligodendrocyte/myelin module: its highest-ranked genes; Mog (myelin oligodendrocyte glycoprotein), Mag (myelin-associated glycoprotein), Aspa (aspartoacylase), Pllp (plasma membrane proteolipid), and Cers2 (ceramide synthase 2),are collectively required for myelin synthesis and maintenance, and are exclusively expressed by mature oligodendrocytes. The exceptionally high within-module spatial correlations (mean rho > 0.45) reflect the tight spatial co-localisation of oligodendrocyte soma in white matter tracts. Apod (apolipoprotein D), also prominent in Module 1, is a lipid transport protein upregulated in oligodendrocytes under stress, suggesting the module captures a combined homeostatic-to-stressed oligodendrocyte continuum.

Module 2 is a broad neuronal module with low within-module coherence (mean rho ≈ 0.05), encompassing upper-layer cortical markers (Cux2, Calb1), calcium signalling (Dgkd), neurotransmitter metabolism (Comt), and immediate-early gene activity (Egr1, Jund). The large size (155 genes) and low pairwise correlations indicate this module captures a diffuse pan-neuronal expression background rather than a tightly co-regulated spatial program, which is consistent with its near-zero plaque association.

Module 3 is a glutamatergic/synaptic module: Snap25 (synaptic vesicle fusion), Grin2a and Grin2b (NMDA receptor subunits), Opcml (opioid binding protein with synaptogenic function), alongside mitochondrial metabolic enzymes (Got1, Got2, Mdh1) and glycolytic enzymes (Pkm, Pgam1) that support the high energy demands of excitatory synapses. The co-enrichment of synaptic and metabolic genes in a single spatial module points to the tight coupling between glutamatergic neurotransmission and local metabolic support — a coupling that is disrupted near plaques (rho_any = −0.074), confirming synaptic loss as a major spatial feature of the periplaque microenvironment.

Module 4 is an astrocyte module: Gja1 (connexin 43, astrocyte gap junctions), Aqp4 (aquaporin 4, the defining astrocyte water channel), Aldh1l1 (pan-astrocyte marker), and Clu (clusterin/ApoJ, an Alzheimer’s GWAS risk gene expressed by astrocytes) define this cluster unambiguously. Mt2 (metallothionein 2, a metal-chelating and anti-oxidant protein induced in reactive astrocytes) and Gstm1 (glutathione transferase) further suggest a homeostatic astrocyte signature with mild oxidative stress response capacity. The presence of Clu is noteworthy: as a secreted chaperone that facilitates amyloid clearance and a major genetic risk factor for late-onset Alzheimer’s disease, its spatial co-regulation with canonical astrocyte markers implicates the homeostatic astrocyte compartment in local amyloid proteostasis.

Module 5 is a vascular/ECM/innate immune module: Flt1 (VEGF receptor 1, endothelial), Mgp (matrix Gla protein, smooth muscle/pericyte), Vtn (vitronectin, ECM), Col1a2 (collagen I), Bgn (biglycan), and Timp3 (TIMP metalloproteinase inhibitor) define a perivascular niche. The co-occurrence of Ifitm2 and Ifitm3 (interferon-induced transmembrane proteins with broad antiviral and anti-inflammatory functions) within this vascular/ECM module is intriguing: both proteins are upregulated in reactive vascular endothelium and have been independently identified in spatial transcriptomic studies of the Alzheimer’s brain. Their selective enrichment near CAA (rho_CAA = 0.026) suggests an interferon-like innate immune activation specifically in the vascular wall adjacent to cerebrovascular amyloid.

Module 6 is a microglia/DAM (disease-associated microglia) module: it is defined by the canonical DAM gene signature including Cx3cr1, Trem2, C1qa, C1qc, P2ry12, Hexb, Gpr34, and Cd68. The combination of homeostatic (P2ry12, Cx3cr1) and DAM markers (Trem2, C1q) within a single spatially coherent module suggests these cells occupy a transitional state between homeostatic surveillance and phagocytic activation, consistent with the spectrum of microglial states described around plaques by single-cell RNA-seq studies. The tight within-module spatial correlations (mean rho > 0.49) confirm that these genes are expressed in the same spatial domains with high fidelity.

Cell-type enrichment per module

Per-cell metagene scores were computed by averaging normalised expression across all genes in each module, pooling cells from both slides. We average this score across annotated cell types to reveal which cell types are the primary drivers of each module. High average metagene score in a cell type means that cell type tends to be in spatial regions where the module is strongly expressed.

Code

# Compute metagene scores for all cells from both slides.
# Pool expression and cell metadata; use unified module assignments.
expr_norm <- cbind(
  Giotto::getExpression(giotto1, values = "normalized", output = "matrix"),
  Giotto::getExpression(giotto2, values = "normalized", output = "matrix")
)

meta_both <- bind_rows(
  getCellMetadata(giotto1, output = "data.table") |> as_tibble(),
  getCellMetadata(giotto2, output = "data.table") |> as_tibble()
)

# Keep only genes present in both the module list and the expression matrix.
shared_genes <- intersect(mod_unified$gene, rownames(expr_norm))
expr_sub     <- expr_norm[shared_genes, , drop = FALSE]
mod_sub      <- mod_unified |> filter(gene %in% shared_genes)

metagene_scores <- imap(
  split(mod_sub$gene, mod_sub$module),   # list: module → gene names
  \(genes, mod) Matrix::colMeans(expr_sub[genes, , drop = FALSE])
) |>
  bind_rows() |>
  t() |>
  as_tibble(rownames = "cell_ID")

# Join with cell-type annotation.
module_cols <- setdiff(names(metagene_scores), "cell_ID")

celltype_scores <- metagene_scores |>
  left_join(meta_both |> select(cell_ID, annotatedclusters), by = "cell_ID") |>
  filter(!is.na(annotatedclusters)) |>
  pivot_longer(all_of(module_cols),
               names_to = "module", values_to = "score") |>
  group_by(annotatedclusters, module) |>
  summarise(mean_score = mean(score, na.rm = TRUE), .groups = "drop")

# Heatmap: cell types × modules
celltype_scores |>
  ggplot(aes(x = module, y = annotatedclusters, fill = mean_score)) +
  geom_tile(colour = "white", linewidth = 0.3) +
  coord_equal()+
  scale_fill_gradient2(low = "#2166ac", mid = "white", high = "#d6604d",
                       midpoint = 0, name = "Mean\nmetagene") +
  labs(
    title = "Mean spatial module score by cell type",
    subtitle = "High score = cell type concentrated in spatial regions where the module is active",
    x = "Module", y = NULL
  ) +
  theme_minimal(base_size = 12)

Code

celltype_scores |>
  pivot_wider(names_from = module, values_from = mean_score) |>
  arrange(annotatedclusters) |>
  kbl(digits = 3, caption = "Mean metagene score per cell type and module") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE,
                font_size = 11)

Mean metagene score per cell type and module
annotatedclusters	V1	V2	V3	V4	V5	V6
Oligodendrocyte	4.272	0.783	2.560	1.262	0.298	0.799
Astrocyte	2.121	0.720	2.622	2.963	0.606	1.031
Glutamatergic Neuron 1	1.141	0.994	4.136	0.941	0.194	0.688
Glutamatergic Neuron 2	1.215	1.109	4.018	0.965	0.203	0.676
Microglia	2.056	0.837	1.925	1.019	0.442	3.878
Endothelial	2.101	0.888	2.410	1.461	1.774	1.058
GABAergic Neuron 1	1.441	1.175	3.937	1.225	0.258	0.799
GABAergic Neuron 2	1.409	1.123	3.833	1.151	0.249	0.824
Fibroblast	2.382	0.846	2.088	1.598	1.973	1.351
VSMC	2.080	0.866	2.240	1.567	1.878	1.213
BAM	2.166	0.819	1.897	1.494	1.721	2.717
VLMC	2.352	0.959	2.442	1.706	1.757	1.298
OPC	2.340	1.177	3.170	1.826	0.467	1.170
Glutamatergic Neuron 3	1.453	1.127	4.017	1.028	0.227	0.679
Pericyte	2.043	0.955	2.633	1.614	1.545	1.196
GABAergic Neuron 3	1.348	1.149	3.872	1.280	0.231	0.858
Glutamatergic Neuron 4	1.318	1.104	3.964	1.022	0.206	0.740
CP	1.821	0.982	2.852	1.744	0.988	1.290
GABAergic Neuron 4	1.435	1.231	4.006	1.194	0.293	0.906
Ependymal	1.948	0.943	2.595	3.027	1.024	0.912
Glutamatergic Neuron 5	1.511	1.211	4.010	0.974	0.222	0.719
T Cell	1.994	1.125	2.631	1.557	0.808	1.831

The cell-type enrichment heatmap confirms and extends the gene-based module annotations. Oligodendrocytes score highest on Module 1 (mean metagene = 4.27), consistent with their exclusive expression of myelin genes. Glutamatergic and GABAergic neurons dominate Module 3 (mean scores 3.9–4.1), validating its synaptic identity. Module 4 is most strongly enriched in astrocytes (mean = 2.96) and also in OPCs (2.34) and vascular cell types — a pattern expected given that Aqp4 and Gja1 are also expressed at lower levels in perivascular glia. Module 6 is most highly expressed in microglia (mean = 3.88) and brain-associated macrophages (BAM, mean = 2.72), confirming the DAM identity; the relatively high BAM score is consistent with recent evidence that meningeal and perivascular macrophages share transcriptional features with parenchymal DAM.

Module 5 shows its highest mean scores in non-neuronal cell types: fibroblasts (1.97), VSMC (1.88), VLMC (1.76), and endothelial cells (1.77), consistent with a vascular/stromal origin. Notably, Module 2 displays the lowest metagene scores across virtually all cell types, suggesting it captures a dispersed low-level expression background rather than a concentrated spatial program — and reinforcing the interpretation that this module does not represent a functionally coherent spatial unit.

A cross-cutting observation is that vascular cell types (pericytes, VSMC, VLMC, endothelial) score consistently highly on Module 1 (means 2.04–2.38), despite Module 1 being defined by myelin genes. This likely reflects spatial proximity between white matter tracts and penetrating vessels, rather than direct expression of myelin genes by vascular cells, and underscores a general limitation of metagene scores: they capture spatial co-localisation, not necessarily direct transcriptional expression in the scoring cell type.

Module–plaque–cell-type summary

Finally, we cross the three dimensions: which modules are plaque-associated (rho_flip > 0 for CAA or parenchymal), which are treatment-responsive (|Δrho| > 0.02), and which cell types drive them.

Code

# Aggregate per-module: plaque association, treatment delta, top cell type
module_summary <- plaque_type_df |>
  filter(plaque == "Any") |>
  group_by(module) |>
  summarise(rho_any = median(rho_flip, na.rm = TRUE), .groups = "drop") |>
  left_join(
    plaque_type_df |> filter(plaque == "CAA") |>
      group_by(module) |> summarise(rho_caa = median(rho_flip, na.rm = TRUE), .groups = "drop"),
    by = "module"
  ) |>
  left_join(
    plaque_type_df |> filter(plaque == "Parenchymal") |>
      group_by(module) |> summarise(rho_paren = median(rho_flip, na.rm = TRUE), .groups = "drop"),
    by = "module"
  ) |>
  left_join(
    delta_df |>
      group_by(module) |> summarise(delta_rho = median(delta_rho, na.rm = TRUE), .groups = "drop"),
    by = "module"
  ) |>
  left_join(
    celltype_scores |>
      # module column is "V1"..."V6" because as_tibble() prefixes numeric names with "V"
      mutate(module = as.integer(sub("^V", "", module))) |>
      group_by(module) |>
      slice_max(mean_score, n = 1) |>
      select(module, top_celltype = annotatedclusters),
    by = "module"
  ) |>
  left_join(
    top_genes_per_module |>
      group_by(module) |>
      slice_max(mean_spat_cor, n = 3) |>
      summarise(top_genes = paste(gene, collapse = ", "), .groups = "drop"),
    by = "module"
  )

module_summary |>
  kbl(
    digits = 3,
    col.names = c("Module", "rho (any)", "rho (CAA)", "rho (paren)",
                  "Δrho (Adu−IgG)", "Top cell type", "Top 3 genes")
  ) |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE)

Module	rho (any)	rho (CAA)	rho (paren)	Δrho (Adu−IgG)	Top cell type	Top 3 genes
1	0.021	-0.052	0.031	0.008	Oligodendrocyte	Mog, Cldn11, Ermn
2	-0.013	-0.002	-0.010	0.006	GABAergic Neuron 4	Calb1, Dgkd, Lamp5
3	-0.074	0.010	-0.060	-0.006	Glutamatergic Neuron 1	Opcml, Grin2b, Got1
4	-0.027	-0.007	-0.034	0.001	Ependymal	Gja1, Aqp4, Pla2g7
5	0.022	0.026	-0.008	0.011	Fibroblast	Ifitm3, Ifitm2, Vtn
6	0.119	-0.009	0.107	0.015	Microglia	Cx3cr1, C1qa, C1qc

Taken together, the spatial co-expression module analysis identifies two biologically coherent plaque-responsive programs and four programs whose spatial exclusion from the plaque microenvironment reflects the cellular damage caused by amyloid accumulation.

The strongest signal in the dataset is the microglial/DAM module (Module 6), which is selectively and robustly recruited to parenchymal plaques, consistent with its role as the primary phagocytic responder to fibrillar amyloid, while remaining absent from CAA lesions. The vascular module (Module 5) provides the reciprocal signal: a modest but selective enrichment near CAA, suggesting a spatially distinct innate immune and ECM remodelling response in the perivascular amyloid niche. The three depleted modules (neuronal/synaptic Modules 2–3, astrocytic Module 4) collectively reflect the progressive loss of parenchymal homeostasis in plaque-proximal tissue.