Does a cell’s expression of this gene change systematically as the cell sits closer to (or farther from) a plaque?
Gene expression ~ distance
compute_gene_distance_correlation()
For each gene we compute a Spearman rank correlation (rho) between two vectors that share one entry per cell:
The cell’s raw transcript count for that gene.
The cell’s Euclidean distance (in um) to its nearest plaque of the specified type.
Spearman correlation operates on ranks rather than raw values, so it captures any monotonic relationship (not just linear) and is robust to the extreme zero-inflation typical of spatial transcriptomics count data. We set exact = FALSE because the large number of tied zeros makes an exact permutation p-value computationally infeasible.
A gene that is highly expressed near plaques (short distance) and lowly expressed far away (long distance) will have a negative rho (high expression co-occurs with low distance). Conversely, a gene expressed preferentially far from plaques will have a positive rho.
To make the results more intuitive,so that “plaque-enriched” genes appear at the top of a ranked list rather than the bottom, we multiply rho by -1 to obtain rho_flip. After flipping, positive values mean “enriched near plaques” and negative values mean “enriched far from plaques”.
We apply the Benjamini-Hochberg (BH) procedure to control the false discovery rate, producing an adjusted p-value (padj). A gene is called significant at padj < 0.05.
Final classification. Each gene is labelled as:
enriched_near — padj < 0.05 and rho_flip > 0 (upregulated close to plaques)
enriched_far — padj < 0.05 and rho_flip < 0 (upregulated far from plaques)
ns — not significant
plot_gene_distance_correlation()
Generates a rank plot. Each dot is one gene.
The x-axis is the gene’s rank (1 = strongest plaque enrichment) and the y-axis is rho_flip (the sign-flipped Spearman correlation). Genes sitting high on the plot are preferentially expressed near plaques; genes at the bottom are expressed far from plaques. The dashed line at zero marks no association.
Orange = significantly enriched near plaques; purple = significantly enriched far from plaques; grey = not significant (BH-adjusted p >= 0.05). The top and bottom genes are automatically labelled, plus any user-specified genes of interest.
Three analyses are shown, each using a different distance metric: (1) distance to nearest CAA plaque only, (2) distance to nearest parenchymal plaque only, and (3) distance to nearest plaque of any type. Comparing across the three reveals whether certain genes are specifically associated with one plaque subtype or with amyloid deposits in general.
Code
plot_gene_distance_correlation(cor_caa, title ="Distance to nearest CAA plaque")
For each gene, Spearman’s rank correlation coefficient (ρ) was calculated between per-cell gene expression levels (raw count data) and the minimum distance to the nearest amyloid plaque. To facilitate interpretation, correlation coefficients were sign-inverted such that positive values indicate higher gene expression in cells located closer to plaques, whereas negative values indicate higher expression at increasing distances from plaques. Genes were ranked by the inverted correlation coefficient. Statistical significance was assessed using two-sided Spearman correlation tests followed by Benjamini–Hochberg false discovery rate correction. Genes were classified as plaque-enriched, plaque-distal, or not significant based on adjusted P values. The five most positive and five most negative correlations are labeled
Code
plot_gene_distance_correlation(cor_paren, title ="Distance to nearest parenchymal plaque")
Code
plot_gene_distance_correlation(cor_any, title ="Distance to nearest plaque (any type)")
Top genes
Code
show_top <-function(results, label) { top <-bind_rows( results |>filter(sig =="enriched_near") |>slice_min(rank, n =10), results |>filter(sig =="enriched_far") |>slice_max(rank, n =10) ) |>select(gene, rho, rho_flip, pval, padj, rank, sig) |>mutate(across(c(rho, rho_flip), \(x) round(x, 4)),across(c(pval, padj), \(x) ifelse(x <1e-300, "< 1e-300", formatC(x, format ="e", digits =2)))) top |>kbl(caption = label) |>kable_styling("striped", full_width =FALSE)}show_top(cor_caa, "Top genes — distance to CAA")
Top genes — distance to CAA
gene
rho
rho_flip
pval
padj
rank
sig
Lamp5
-0.1404
0.1404
< 1e-300
< 1e-300
1
enriched_near
Stard8
-0.1242
0.1242
< 1e-300
< 1e-300
2
enriched_near
Slc17a7
-0.1184
0.1184
< 1e-300
< 1e-300
3
enriched_near
Cux2
-0.1165
0.1165
< 1e-300
< 1e-300
4
enriched_near
Bhlhe40
-0.1131
0.1131
< 1e-300
< 1e-300
5
enriched_near
Rspo1
-0.1028
0.1028
< 1e-300
< 1e-300
6
enriched_near
Pparg
-0.0981
0.0981
< 1e-300
< 1e-300
7
enriched_near
Serpinf1
-0.0850
0.0850
< 1e-300
< 1e-300
8
enriched_near
Egr1
-0.0817
0.0817
< 1e-300
< 1e-300
9
enriched_near
Lum
-0.0800
0.0800
< 1e-300
< 1e-300
10
enriched_near
Mbp
0.1356
-0.1356
< 1e-300
< 1e-300
459
enriched_far
Ermn
0.0896
-0.0896
< 1e-300
< 1e-300
458
enriched_far
Nts
0.0888
-0.0888
< 1e-300
< 1e-300
457
enriched_far
Prdx1
0.0855
-0.0855
< 1e-300
< 1e-300
456
enriched_far
Syt6
0.0844
-0.0844
< 1e-300
< 1e-300
455
enriched_far
Slc32a1
0.0837
-0.0837
< 1e-300
< 1e-300
454
enriched_far
Tac1
0.0789
-0.0789
< 1e-300
< 1e-300
453
enriched_far
B3galt5
0.0785
-0.0785
< 1e-300
< 1e-300
452
enriched_far
Mog
0.0770
-0.0770
< 1e-300
< 1e-300
451
enriched_far
Mag
0.0751
-0.0751
< 1e-300
< 1e-300
450
enriched_far
Code
show_top(cor_paren, "Top genes — distance to parenchymal")
Top genes — distance to parenchymal
gene
rho
rho_flip
pval
padj
rank
sig
Cst7
-0.2954
0.2954
< 1e-300
< 1e-300
1
enriched_near
Trem2
-0.2146
0.2146
< 1e-300
< 1e-300
2
enriched_near
Bcl2a1b
-0.2043
0.2043
< 1e-300
< 1e-300
3
enriched_near
Lyz2
-0.2000
0.2000
< 1e-300
< 1e-300
4
enriched_near
Cd68
-0.1913
0.1913
< 1e-300
< 1e-300
5
enriched_near
C1qa
-0.1874
0.1874
< 1e-300
< 1e-300
6
enriched_near
C1qc
-0.1839
0.1839
< 1e-300
< 1e-300
7
enriched_near
Ctsd
-0.1829
0.1829
< 1e-300
< 1e-300
8
enriched_near
Ly86
-0.1803
0.1803
< 1e-300
< 1e-300
9
enriched_near
Ccl3
-0.1779
0.1779
< 1e-300
< 1e-300
10
enriched_near
Psap
0.1504
-0.1504
< 1e-300
< 1e-300
459
enriched_far
Ids
0.1468
-0.1468
< 1e-300
< 1e-300
458
enriched_far
Calb1
0.1282
-0.1282
< 1e-300
< 1e-300
457
enriched_far
Gaa
0.1248
-0.1248
< 1e-300
< 1e-300
456
enriched_far
Srebf2
0.1237
-0.1237
< 1e-300
< 1e-300
455
enriched_far
Pygb
0.1233
-0.1233
< 1e-300
< 1e-300
454
enriched_far
Fdft1
0.1198
-0.1198
< 1e-300
< 1e-300
453
enriched_far
Smpd1
0.1154
-0.1154
< 1e-300
< 1e-300
452
enriched_far
Dnah11
0.1146
-0.1146
< 1e-300
< 1e-300
451
enriched_far
Ptges3
0.1128
-0.1128
< 1e-300
< 1e-300
450
enriched_far
Code
show_top(cor_any, "Top genes — distance to any plaque")
p1 <-plot_gene_distance_correlation(cor_caa_adu, title ="CAA — Aducanumab")p2 <-plot_gene_distance_correlation(cor_caa_igg, title ="CAA — IgG")p1 + p2
Code
pa <-plot_gene_distance_correlation(cor_paren_adu, title ="Parenchymal — Aducanumab")pi <-plot_gene_distance_correlation(cor_paren_igg, title ="Parenchymal — IgG")pa + pi
Code
a1 <-plot_gene_distance_correlation(cor_any_adu, title ="Any plaque — Aducanumab")a2 <-plot_gene_distance_correlation(cor_any_igg, title ="Any plaque — IgG")a1 + a2
Adu vs IgG scatter
Each point is a gene. X-axis = rho_flip in IgG; Y-axis = rho_flip in Aducanumab. Points are coloured by concordance of significance: both enriched near, both enriched far, discordant, or non-significant in both.