Spatial autocorrelation metrics

Giotto implementation: Moran’s I, Geary’s C, and binSpect — kNN(15) and Delaunay spatial networks

Published

March 27, 2026

Setup and data loading

Code

source("R/setup.R")
library(Giotto)
library(patchwork)
library(ggrepel)
library(spdep)
options(future.globals.maxSize = 8 * 1024^3)  # 8 GiB for large Giotto objects

data <- load_slides_and_distances("full")
slide1 <- data$slide1; slide2 <- data$slide2
cell_distances_all <- data$distances_all

Convert Seurat to Giotto

Giotto status — seuratToGiottoV5(): BROKEN

seuratToGiottoV5() is incompatible with SeuratObject ≥ 5 (defunct slot argument). The function errors immediately and cannot be used.

Workaround: Giotto objects are built manually using createGiottoObject() + normalizeGiotto() + addCellMetadata(). All three of these lower-level functions work natively without issues.

Code

# seuratToGiottoV5() is incompatible with SeuratObject >= 5 (defunct slot arg).
# Build Giotto objects manually from expression + coordinates + metadata.
seurat_to_giotto_manual <- function(seurat_obj) {
  expr_mat <- LayerData(seurat_obj, assay = DefaultAssay(seurat_obj), layer = "counts")

  coords_df <- GetTissueCoordinates(seurat_obj) |>
    as.data.frame() |>
    dplyr::select(cell, x, y) |>
    column_to_rownames("cell") |>
    dplyr::rename(sdimx = x, sdimy = y)

  shared <- intersect(colnames(expr_mat), rownames(coords_df))
  expr_mat  <- expr_mat[, shared]
  coords_df <- coords_df[shared, ]

  g <- createGiottoObject(
    expression   = list(raw = expr_mat),
    spatial_locs = coords_df
  )

  # Normalize (required for spdepAutoCorr)
  g <- normalizeGiotto(g, verbose = FALSE)

  # Add cell-type metadata
  meta <- seurat_obj@meta.data |>
    as_tibble(rownames = "cell") |>
    dplyr::select(cell_ID = cell, annotatedclusters)

  g <- addCellMetadata(g, new_metadata = meta, by_column = TRUE, column_cell_ID = "cell_ID")
  g
}

giotto1 <- seurat_to_giotto_manual(slide1)
giotto2 <- seurat_to_giotto_manual(slide2)

Spatial networks

Giotto status — createSpatialNetwork(): native, works

Both method = "Delaunay" and method = "kNN" work correctly and return the updated giotto object as documented. No issues observed.

Spatial autocorrelation analyses quantify whether nearby cells have more similar gene expression than expected by chance (positive autocorrelation; “spatial clustering”), or instead tend to differ (negative autocorrelation; “spatial dispersion”). The definition of “nearby” is therefore a modelling choice that must be made explicit, and it is encoded in a spatial weights/neighbor graph.

We build two types of spatial networks per slide, because they address different biological questions:

Delaunay triangulation: connects immediate spatial neighbours; used for cell-type co-localisation enrichment (who tends to sit next to whom), which is naturally framed as adjacency.
k-nearest neighbours (k=15): connects each cell to its 15 closest neighbours; used for Moran’s I and Geary’s C, which require a weights structure over neighbors and are easier to compare across regions when neighborhood size is fixed.

For Moran’s I / Geary’s C, we use row-standardized weights (style = "W" in spdep::mat2listw), which rescales each cell’s outgoing neighbour weights to sum to 1. This reduces sensitivity to differences in effective connectivity (for example, edge cells versus interior cells), because each cell contributes comparably to the global statistic after normalization.

kNN neighbourhoods are inherently directed/asymmetric: cell A can include cell B among its 15 nearest neighbours even if cell B does not include cell A. We convert this directed graph to an undirected (symmetrized) weight matrix using createSpatialWeightMatrix(), which adds the reverse edge wherever it is missing.

Symmetrization satisfies the requirement for symmetric weights assumed by classical Moran’s I inference, and means each cell’s effective neighbourhood includes both the cells it selected and those that selected it. Under row-standardization this produces somewhat larger effective neighbourhoods and consequently more conservative (lower) Moran’s I estimates than a raw directed-edge approach. We treat Moran’s I and Geary’s C primarily as ranking statistics and interpret analytical p-values cautiously.

Code

# Note: tried paralleliaation with mclapply biu it does not work here: Giotto objects contain
# C++ pointers and data.table internals that break fork()-based workers.
# Sequential execution is required.
giotto1 <- createSpatialNetwork(giotto1, method = "Delaunay", name = "delaunay")
giotto1 <- createSpatialNetwork(giotto1, method = "kNN", k = 15, name = "knn15")

giotto2 <- createSpatialNetwork(giotto2, method = "Delaunay", name = "delaunay")
giotto2 <- createSpatialNetwork(giotto2, method = "kNN", k = 15, name = "knn15")

qs2::qs_save(giotto1, "spatial/giotto1.qs2")
qs2::qs_save(giotto2, "spatial/giotto2.qs2")

Code

giotto1 <- qs2::qs_read("spatial/giotto1.qs2")
giotto2 <- qs2::qs_read("spatial/giotto2.qs2")

Moran’s I — global spatial autocorrelation

Giotto status — Moran’s I pipeline: native with one workaround

createSpatialWeightMatrix() — native, works. Returns the raw adjacency matrix; passed to spdep::mat2listw() for row-standardized weights.
getExpression() — namespace conflict. The future package exports a getExpression() generic that masks Giotto’s when future is loaded (which happens automatically in the treatment-stratified section). Works here because library(Giotto) is loaded before library(future). When in doubt, use Giotto::getExpression() explicitly.
spdep::moran.test() — external package (spdep), not Giotto. Works natively.
spdepAutoCorr() (treatment section) — Giotto-native wrapper around spdep; works, but has no internal tryCatch, so zero-variance genes must be pre-filtered manually before calling or it crashes.

Global Moran’s I summarizes whether a gene’s expression is spatially clustered across the full tissue section at the scale defined by the kNN(15) graph.
Intuitively, Moran’s I increases when cells tend to resemble their neighbours (e.g., high-expression cells near other high-expression cells), producing broad spatial domains, while values near 0 indicate little spatial structure at this scale. Negative values indicate local alternation (neighbours tending to differ), which is less common in many biological expression maps but can occur for sharply alternating patterns.
Here, Moran’s I is used mainly as a ranking statistic: genes with higher Moran’s I show stronger spatial patterning and are more likely to reflect anatomy, cell-type organization, or spatially restricted biological processes.

Code

# Uses Giotto's createSpatialWeightMatrix() to build the spdep listw object,
compute_moran_direct <- function(giotto_obj, network_name = "knn15") {
  wm  <- createSpatialWeightMatrix(
    giotto_obj, spatial_network_to_use = network_name,
    method = "adjacency", return_gobject = FALSE, verbose = FALSE
  )
  lw  <- spdep::mat2listw(wm, style = "W", zero.policy = TRUE)

  expr_mat <- Giotto::getExpression(giotto_obj, values = "normalized", output = "matrix")
  gene_var  <- apply(expr_mat, 1, var)
  genes     <- names(gene_var[gene_var > 0])
  message(sprintf("Testing %d / %d genes (non-zero variance)", length(genes), nrow(expr_mat)))

  purrr::map(genes, \(g) tryCatch(
    {
      mt <- spdep::moran.test(as.numeric(expr_mat[g, ]), lw, zero.policy = TRUE)
      tibble(gene = g, moran_I = mt$estimate[["Moran I statistic"]], pval = mt$p.value)
    },
    error = \(e) tibble(gene = g, moran_I = NA_real_, pval = NA_real_)
  ), .progress = "Moran's I") |>
    list_rbind() |>
    filter(!is.na(moran_I))
}

moran_1 <- compute_moran_direct(giotto1) |> mutate(slide = "Slide 1")
moran_2 <- compute_moran_direct(giotto2) |> mutate(slide = "Slide 2")

moran_combined <- bind_rows(moran_1, moran_2)

# write_csv(moran_combined, "spatial/moran_combined.csv")

Moran’s I rank plot

Moran’s I measures whether nearby cells have similar expression values (I > 0: spatial clustering; I ≈ 0: spatial randomness; I < 0: spatial dispersion).

Interpretation note: The analytical p-values from spdep::moran.test assume a symmetric weights matrix (e.g., if Cell A is connected to Cell B, then Cell B should be connected to Cell A). However, the K-Nearest Neighbor (kNN) networks used to map spatial data do not actually guarantee this symmetry.
Because this broken assumption makes the fast p-values unreliable, spdep explicitly recommends checking results using permutation-based alternatives such asmoran.mc (ultra-slow) when inference is critical.
For this reason, we interpret Moran’s I primarily as an effect size for ranking genes by spatial structure, and treat p-values as secondary.

Code

moran_combined <- read_csv("spatial/moran_combined.csv")
moran_mean <- moran_combined |>
  group_by(gene) |>
  summarise(mean_moran = mean(moran_I, na.rm = TRUE), .groups = "drop") |>
  arrange(desc(mean_moran)) |>
  mutate(rank = row_number())

top_label <- bind_rows(
  moran_mean |> slice_head(n = 10),
  moran_mean |> slice_tail(n = 5)
)

ggplot(moran_mean, aes(x = rank, y = mean_moran)) +
  geom_point(alpha = 0.5, colour = "grey50") +
  geom_text_repel(
    data = top_label,
    aes(label = gene),
    size = 3.5, fontface = "italic", max.overlaps = 20
  ) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "red") +
  labs(
    title = "Global Moran's I — mean across both slides",
    x = "Gene rank", y = "Mean Moran's I"
  ) +
  theme_minimal(base_size = 13)+
  theme(panel.grid.minor = element_blank(),
  panel.grid.major.x = element_blank())

Global Moran’s I was computed for each gene in the Xenium panel using a k=15 nearest-neighbour spatial weight matrix with row-standardized weights.

Most genes exhibited positive Moran’s I (range: −0.0003 to 0.40; nine genes showed marginally negative values, all effectively zero), consistent with the spatially organized architecture of the mouse brain.

The distribution was right-skewed (median I = 0.039, mean I = 0.057), with 39.3% of genes showing I ≥ 0.05 and 3.4% showing I ≥ 0.20. The overall lower values compared to directed-edge approaches reflect the larger effective neighbourhoods produced by graph symmetrization under row-standardization..

The most spatially structured genes were: Slc17a7 (I = 0.40, excitatory neurons), Mbp (I = 0.29, myelin/white matter), Lamp5 (I = 0.29, cortical interneurons), Car12 (I = 0.27, oligodendrocyte lineage), and Ttr (I = 0.26, choroid plexus). These reflect normal brain laminar and regional architecture rather than disease processes. At the opposite extreme, genes expressed by rare, scattered populations showed near-zero Moran’s I (Ly6g, Chil3, Apoc2, Il1r2), indicating spatially random expression.

Spatial autocorrelation estimates were highly reproducible between slides (Pearson r = 0.959; Spearman rho = 0.924).

binSpect — binary spatially variable genes

Giotto status — binSpect(): native, works

binSpect() runs without errors. Two things worth noting:

A showGiottoInstructions() deprecation warning fires at startup — benign, can be ignored.
return_gobject = FALSE must be set to get results as a data.frame/tibble rather than an updated giotto object. Without this, downstream tidyverse operations fail.

No workarounds required beyond those two flags.

binSpect (Binary Spatial Extraction) is complementary to Moran’s I: rather than computing a global covariance statistic over continuous expression values, it binarizes each gene’s expression into “high” and “low” cells (rank-based threshold: top 30% of expressing cells) and then tests via Fisher exact test whether high-expressing cells are enriched among their spatial neighbours.

The rank method is more robust than k-means binarization for sparse Xenium panels where many genes have few expressing cells. Simulation studies show binSpect is more robust to noise than continuous methods and better at recovering known spatial genes, particularly for discrete expression domains rather than broad gradients.

Comparing binSpect odds-ratio rank with Moran’s I rank allows us to distinguish:

High on both: robustly spatial genes detected by both continuous and binary metrics
High binSpect / low Moran’s I: discrete local hubs (tight spatial clusters of a few cells)
High Moran’s I / low binSpect: broad anatomical gradients with graded, non-binary expression

Code

# binSpect() single-kernel variant (one spatial network at a time).
# Use binSpectMulti() for multi-kernel testing across several networks.
# The showGiottoInstructions() deprecation warning is benign.
# Run first, then re-render; outputs saved to spatial/binspect_combined.csv.
bs1 <- binSpect(
  giotto1, spatial_network_name = "delaunay",
  bin_method = "rank", do_fisher_test = TRUE,
  adjust_method = "fdr", return_gobject = FALSE
) |> mutate(slide = "Slide 1")

bs2 <- binSpect(
  giotto2, spatial_network_name = "delaunay",
  bin_method = "rank", do_fisher_test = TRUE,
  adjust_method = "fdr", return_gobject = FALSE
) |> mutate(slide = "Slide 2")

bs_combined <- bind_rows(bs1, bs2)
# write_csv(bs_combined, "spatial/binspect_combined.csv")

Code

bs_combined <- read_csv("spatial/binspect_combined.csv")

bs_mean <- bs_combined |>
  group_by(feats) |>
  summarise(mean_estimate = mean(estimate, na.rm = TRUE), .groups = "drop") |>
  arrange(desc(mean_estimate)) |>
  mutate(bs_rank = row_number())

moran_vs_bs <- bs_mean |>
  inner_join(moran_mean |> dplyr::select(gene, mean_moran, rank), by = c("feats" = "gene")) |>
  rename(moran_rank = rank)

top_bs <- bind_rows(
  moran_vs_bs |> slice_head(n = 10),
  moran_vs_bs |> slice_tail(n = 5)
)

p1 <- ggplot(bs_mean, aes(x = bs_rank, y = mean_estimate)) +
  geom_point(alpha = 0.5, colour = "grey50") +
  geom_text_repel(
    data = top_bs,
    aes(x = bs_rank, label = feats),
    size = 3.5, fontface = "italic", max.overlaps = 20) +
  labs(
    title = "binSpect: Fisher odds ratio",
    subtitle = "mean across both slides",
    x = "Gene rank (binSpect)", y = "Mean odds ratio (estimate)") +
  theme_minimal(base_size = 13)

p2 <- ggplot(moran_vs_bs, aes(x = moran_rank, y = bs_rank)) +
  geom_point(alpha = 0.4, colour = "grey50") +
  geom_text_repel(
    data = moran_vs_bs |> filter(abs(moran_rank - bs_rank) > quantile(abs(moran_rank - bs_rank), 0.97)),
    aes(label = feats),
    size = 3.5, fontface = "italic", max.overlaps = 15) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", colour = "red") +
  labs(
    title = "Moran's I rank vs binSpect rank",
    subtitle = "Off-diagonal: local hubs (high binSpect)\n vs broad gradients (high Moran's I)",
    x = "Moran's I rank (1 = most spatial)",
    y = "binSpect rank (1 = highest odds ratio)") +
  theme_minimal(base_size = 13)

p1 + p2

The binSpect odds ratio distribution is extremely right-skewed, dominated by a handful of genes expressed in rare, spatially tightly clustered cell types.

The top-ranked genes are mast cell markers — Tpsb2 (odds ratio ≈ 2000), Fcer1a (~1000), and Tpsab1 (~850) — indicating that when a mast cell is present, its immediate spatial neighbours are almost exclusively also mast cells.

Other high-ranking genes include ciliary/ependymal markers (Ccdc153, Tmem212, Dnah11), and the NK cell marker Klrb1c. The vast majority of genes have near-zero odds ratios, indicating no discrete local spatial enrichment.

Comparing binSpect rank with Moran’s I rank reveals two distinct populations. Most genes cluster near the diagonal, meaning both methods agree on their degree of spatial structure. The key exception is a cluster of genes with high binSpect odds ratios but very low Moran’s I (bottom-right in the rank scatter): mast cell markers (Tpsb2, Tpsab1, Fcer1a) alongside T cell, NK cell, and inflammatory markers (Gzmb, Nkg7, Cd3d, Itk, Il1b, Tnf, Ccl4, Mgl2, Cd209a). These immune cell genes are invisible to Moran’s I because the cells are too rare to drive global spatial autocorrelation, yet binSpect detects them as extreme local hubs — when one immune cell is present, its neighbours are disproportionately also immune cells. In the context of amyloid pathology, this points to spatially clustered immune infiltration that a global statistic would entirely miss.

Moran’s I vs distance correlation

Moran’s I alone does not identify what drives spatial structure. A gene can have high Moran’s I simply because it marks anatomy or cell-type domains, even if it is unrelated to plaques.

To separate these effects, we combine Moran’s I with an independent plaque-association measure: the Spearman correlation between gene expression and cell-to-plaque distance (with the sign flipped so that positive values indicate higher expression nearer plaques).

High Moran’s I + high rho_flip: spatially clustered AND plaque-associated (e.g., DAM genes)
High Moran’s I + low rho_flip: spatially structured but NOT due to plaques (brain architecture)
Low Moran’s I + high rho_flip: plaque-associated but not broadly spatial (focal plaque response)
Low Moran’s I + low rho_flip: neither spatial nor plaque-associated

Code

cor_any <- qs2::qs_read("spatial/cor_any.qs2")

moran_vs_dist <- moran_mean |>
  inner_join(cor_any |> dplyr::select(gene, rho_flip, sig), by = "gene")



label_set <- moran_vs_dist |>
  filter(
    #gene %in% dam_genes |
    mean_moran > quantile(mean_moran, 0.99)|
    rho_flip > quantile(abs(rho_flip), 0.99) |
    (mean_moran > quantile(mean_moran, 0.90) & abs(rho_flip) > quantile(abs(rho_flip), 0.9)) |
    (mean_moran > quantile(mean_moran, 0.90) & abs(rho_flip) < quantile(abs(rho_flip), 0.1)) |
    (mean_moran < quantile(mean_moran, 0.25) & abs(rho_flip) > quantile(abs(rho_flip), 0.9)))

ggplot(moran_vs_dist, aes(x = rho_flip, y = mean_moran)) +
  geom_point(aes(colour = sig), alpha = 0.5) +
  geom_text_repel(
    data = label_set,
    aes(label = gene),
    size = 3.5, fontface = "italic", max.overlaps = 20) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey60") +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey60") +
  scale_colour_manual(
    values = c("enriched_near" = "#E07B39", "enriched_far" = "#7B68AE", "ns" = "grey70"),
    labels = c("enriched_near" = "Enriched near plaque", "enriched_far" = "Enriched far", "ns" = "NS")) +
  labs(
    title = "Moran's I vs gene-distance correlation",
    subtitle = "Top-right: spatially clustered AND plaque-associated",
    x = "Spearman rho (flipped; + = near plaque)",
    y = "Mean Moran's I",
    colour = "Distance sig."
  ) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "bottom")

To distinguish anatomy-driven from plaque-driven spatial gene expression, we combined Moran’s I with the Spearman correlation between gene expression and distance to the nearest plaque (sign-flipped so positive values indicate enrichment near plaques). Genes partitioned into four quadrants. Of 459 tested genes, 81 fell in the “spatial + plaque” quadrant (above-median Moran’s I and positive rho_flip), 149 in “spatial only,” 112 in “plaque only,” and 117 in “neither.”

The labeled genes separate into two clear groups. First, the strongest global spatial genes were largely plaque-distant/anatomy-linked: Car12 (I=0.265, rho_flip=−0.126), Calb1 (0.223, −0.130), Ccdc153 (0.222, −0.088), Tmem212 (0.206, −0.078), Ids (0.200, −0.158), Chst1 (0.196, −0.126), and Fdft1 (0.141, −0.138), consistent with broad anatomical domains rather than plaque-local biology.

Second, canonical DAM genes were strongly plaque-associated but only modestly spatial by Moran’s I: Cst7 (0.078, 0.295), Lyz2 (0.051, 0.220), Trem2 (0.044, 0.216), Bcl2a1b (0.039, 0.208), and Cd68 (0.038, 0.198), consistent with focal plaque-proximal microglial activation. Mbp was notable as the highest-ranking gene combining strong spatial structure (I = 0.29, rank 2 overall) with meaningful plaque association (rho_flip = 0.12), likely because parenchymal plaques form preferentially in myelinated regions.

Treatment-stratified Moran’s I

Giotto status — spdepAutoCorr() in parallel: native with fine-tuning

spdepAutoCorr() — Giotto-native Moran’s I wrapper; works. Used here instead of the manual compute_moran_direct() to leverage its future.apply parallelisation.
Zero-variance gene filter — must be applied on the Seurat object before Giotto conversion. spdepAutoCorr() has no internal tryCatch; passing zero-variance genes causes it to crash with an uninformative error.
future parallel session — plan(multisession, workers = 12) + options(future.globals.maxSize = 4 * 1024^3) are required. Without raising the globals limit, future workers silently fail when serializing the large Giotto objects.
getExpression() masking — loading library(future) here masks Giotto::getExpression(). For this reason, the zero-variance filter is done on the Seurat counts matrix directly rather than extracting from the giotto object.

Next we tested if spatial autocorrelation differ between Adu- and IgG-treated tissue. For that, we subset each slide by treatment and recomputed Moran’s I.

Code

# Suppress Giotto's Python initialisation attempts — analysis is pure R.
options(giotto.py_path = NULL)

# Uses spdepAutoCorr() (Giotto-native Moran's I wrapper) instead of the manual
# compute_moran_direct(). Pre-filter zero-variance genes before calling since
# spdepAutoCorr() has no internal tryCatch.

# spdepAutoCorr() uses future.apply internally for gene-level iteration —
# multisession lets each of the 4 group calls use all cores for its gene loop.
# future.globals.maxSize raised here because each worker receives the full
# Giotto object; 8 GiB set in setup is for the main session, not per-worker.
library(future)
plan(multisession, workers = 12)
options(future.globals.maxSize = 4 * 1024^3)  # 4 GiB per worker

run_treatment_moran <- function(seurat_subset, group_label) {
  # Filter zero-variance genes on the Seurat object before Giotto conversion —
  # avoids calling getExpression() on the giotto S4 class, which has dispatch
  # issues in this Giotto version.
  counts <- LayerData(seurat_subset, assay = DefaultAssay(seurat_subset), layer = "counts")
  keep   <- rownames(counts)[apply(counts, 1, var) > 0]
  seurat_subset <- seurat_subset[keep, ]

  g <- seurat_to_giotto_manual(seurat_subset)
  g <- createSpatialNetwork(g, method = "kNN", k = 15, name = "knn15")
  spdepAutoCorr(g, method = "moran.test", spatial_network_to_use = "knn15") |>
    as_tibble() |>
    dplyr::select(gene = feat_ID, moran_I = value) |>
    mutate(group = group_label)
}

moran_adu <- bind_rows(
  run_treatment_moran(subset(slide1, subset = Treatment.Group == "Adu"), "Adu_slide1"),
  run_treatment_moran(subset(slide2, subset = Treatment.Group == "Adu"), "Adu_slide2")
)

moran_igg <- bind_rows(
  run_treatment_moran(subset(slide1, subset = Treatment.Group == "IgG"), "IgG_slide1"),
  run_treatment_moran(subset(slide2, subset = Treatment.Group == "IgG"), "IgG_slide2")
)

moran_by_treatment <- bind_rows(
  moran_adu |> mutate(treatment = "Adu"),
  moran_igg |> mutate(treatment = "IgG")
) |>
  group_by(gene, treatment) |>
  summarise(mean_moran = mean(moran_I, na.rm = TRUE), .groups = "drop")

plan(sequential)  # release worker processes before rest of session

# write_csv(moran_by_treatment, "spatial/moran_by_treatment.csv")

Code

moran_by_treatment <- read_csv("spatial/moran_by_treatment.csv")

moran_treat_wide <- moran_by_treatment |>
  pivot_wider(names_from = treatment, values_from = mean_moran)

label_treat <- moran_treat_wide |>
  filter(
    abs(Adu - IgG) > quantile(abs(moran_treat_wide$Adu - moran_treat_wide$IgG), 0.98, na.rm = TRUE)
  )

ggplot(moran_treat_wide, aes(x = IgG, y = Adu)) +
  geom_point(alpha = 0.6, size = 2, colour = "grey60") +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", colour = "red") +
  geom_text_repel(
    data = label_treat,
    aes(label = gene),
    size = 4.5, fontface = "italic", max.overlaps = 15,
  ) +
  labs(
    title = "Moran's I: Adu vs IgG",
    subtitle = "Genes off-diagonal have treatment-dependent spatial structure",
    x = "Mean Moran's I (IgG)", y = "Mean Moran's I (Adu)"
  ) +
  theme_minimal(base_size = 13) +
  coord_equal()

Most genes clustered near the identity line \((I_{\mathrm{Adu}} \approx I_{\mathrm{IgG}})\), indicating that overall transcriptome-wide spatial organization is largely preserved between treatment groups.

To detect treatment-sensitive spatial changes, we ranked genes by \((|I_{\mathrm{Adu}} - I_{\mathrm{IgG}}|)\) and selected the top 2% most off-diagonal genes.

The genes that gain spatial structure in Adu (Cxcl10, Plin5) point to focal inflammatory and lipid-metabolic responses around plaque-clearing sites, consistent with the treatment mechanism.

The genes losing spatial structure in Adu (Nts, Tac1, Egr1) suggest reduced spatially-organized neuronal activity patterns, which could reflect altered network activity after plaque burdenreduction, or conversely a mild perturbation from the immune response. The labelled set is small (top 2%), so these are exploratory signals rather than conclusions.

Tmem212 is essentially a positive control for the assay. It is is a cilia-associated gene expressed almost exclusively in ependymal cells, which line the ventricles in a thin, anatomically fixed sheet. Because ependymal cells are locked to the ventricular walls, their marker gene naturally has extreme spatial clustering.

Because Moran’s I quantifies global spatial dependence, these differences reflect altered spatial patterning rather than mean-expression shifts alone.

Geary’s C — complementary spatial measure

Giotto status — Geary’s C pipeline: native, works

Identical pipeline to Moran’s I: createSpatialWeightMatrix() + spdep::geary.test(). The same getExpression() namespace note applies (load Giotto before future, or use Giotto::getExpression() explicitly), but otherwise no issues. No Giotto-specific workarounds required.

Moran’s I and Geary’s C are both global measures of spatial autocorrelation computed on the same neighbor graph, but they emphasize different aspects of spatial structure. Moran’s I is covariance/correlation-like and is most sensitive to broad spatial domains, whereas Geary’s C is based on squared neighbour-to-neighbour differences and is therefore more sensitive to local heterogeneity and sharp boundaries.

For Geary’s C, C ~1 indicates no spatial autocorrelation, C < 1 indicates positive autocorrelation (neighbours are similar), and C > 1 indicates negative autocorrelation (neighbours are dissimilar).

As a result, genes with high Moran’s I often tend to have low Geary’s C, so plotting Moran vs Geary provides a simple consistency check and can highlight genes where “spatial structure” is dominated by local discontinuities.

Code

compute_geary_direct <- function(giotto_obj, network_name = "knn15") {
  wm  <- createSpatialWeightMatrix(
    giotto_obj, spatial_network_to_use = network_name,
    method = "adjacency", return_gobject = FALSE, verbose = FALSE
  )
  lw  <- spdep::mat2listw(wm, style = "W", zero.policy = TRUE)

  expr_mat <- Giotto::getExpression(giotto_obj, values = "normalized", output = "matrix")
  gene_var  <- apply(expr_mat, 1, var)
  genes     <- names(gene_var[gene_var > 0])

  purrr::map(genes, \(g) tryCatch(
    {
      gt <- spdep::geary.test(as.numeric(expr_mat[g, ]), lw, zero.policy = TRUE)
      tibble(gene = g, geary_C = gt$estimate[["Geary C statistic"]], pval = gt$p.value)
    },
    error = \(e) tibble(gene = g, geary_C = NA_real_, pval = NA_real_)
  ), .progress = "Geary's C") |>
    list_rbind() |>
    filter(!is.na(geary_C))
}

geary_1 <- compute_geary_direct(giotto1) |> mutate(slide = "Slide 1")
geary_2 <- compute_geary_direct(giotto2) |> mutate(slide = "Slide 2")

geary_mean <- bind_rows(geary_1, geary_2) |>
  group_by(gene) |>
  summarise(mean_geary = mean(geary_C, na.rm = TRUE), .groups = "drop")

# write_csv(geary_mean, "spatial/geary_mean.csv")

Code

geary_mean <- read_csv("spatial/geary_mean.csv")

geary_vs_moran <- geary_mean |>
  inner_join(moran_mean, by = "gene")

ggplot(geary_vs_moran, aes(x = mean_moran, y = mean_geary)) +
  geom_point(alpha = 0.4, colour = "grey50") +
  geom_text_repel(
    data = geary_vs_moran |> filter(gene %in% label_set$gene),
    aes(label = gene),
    size = 3.5, fontface = "italic"
  ) +
  labs(
    title = "Moran's I vs Geary's C",
    subtitle = "High Moran's I should correspond to low Geary's C",
    x = "Mean Moran's I", y = "Mean Geary's C"
  ) +
  theme_minimal(base_size = 13)

To validate the Moran’s I findings, we computed Geary’s C for all genes using the same kNN(15) spatial weight matrix. Geary’s C is based on squared neighbour-to-neighbour differences and is therefore more sensitive to local heterogeneity than Moran’s I, which captures global covariance structure. As expected, the two statistics showed a strong inverse relationship: genes with high Moran’s I exhibited low Geary’s C (positive autocorrelation confirmed by both measures), and vice versa. The tight inverse correlation between the two statistics indicates that the spatial autocorrelation rankings are robust and not artefacts of the specific test statistic used.