---
title: "tlsR Workflow: From Raw Imaging Data to TLS Characterisation"
author: "Ali Amiryousefi"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{tlsR Workflow: From Raw Imaging Data to TLS Characterisation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6,
  fig.height = 5,
  eval = TRUE
)
```

## Introduction

Tertiary lymphoid structures (TLS) are ectopic lymphoid organs that form in
non-lymphoid tissues, most notably in tumours, and are associated with improved
patient outcomes and immunotherapy response. **tlsR** provides a fast,
reproducible pipeline for detecting TLS and characterising their spatial
organisation in multiplexed tissue imaging data (e.g. mIHC, CODEX, IMC).

The core pipeline is:

```
Raw ldata list
     │
     ▼
detect_TLS()               ← KNN-based B+T co-localisation
     │
     ├──► scan_clustering()   ← Optional: local Ripley's L
     │
     ├──► calc_icat()         ← ICAT linearity score per TLS
     │
     ├──► detect_tic()        ← T-cell clusters outside TLS
     │
     ├──► summarize_TLS()     ← Tidy summary table
     │
     └──► plot_TLS()          ← Publication-ready spatial plot
```

---

## Data Format

`tlsR` expects a **named list of data frames** (`ldata`), one element per
tissue sample. Each data frame must contain at minimum:

| Column      | Type      | Description                                      |
|-------------|-----------|--------------------------------------------------|
| `x`         | numeric   | X coordinate in microns                          |
| `y`         | numeric   | Y coordinate in microns                          |
| `phenotype` | character | Cell label; must contain `"B cell"` / `"T cell"` |

Additional columns (e.g. cell area, marker intensities) are silently ignored.

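
To make the expected layout concrete, here is a minimal sketch of assembling an `ldata`-style list from a single long table with base R. The input column names (`sample_id`, `X_um`, `Y_um`, `label`) and all values are invented for illustration; only the output columns `x`, `y` and `phenotype` come from the table above.

```{r build-ldata}
# Hypothetical per-cell export; column names and values are assumptions
set.seed(1)
cells <- data.frame(
  sample_id = rep(c("S1", "S2"), each = 5),
  X_um      = runif(10, 0, 1000),
  Y_um      = runif(10, 0, 1000),
  label     = rep(c("B cell", "T cell", "Other", "B cell", "T cell"), 2),
  stringsAsFactors = FALSE
)

# Rename to the columns tlsR expects
std <- data.frame(
  x         = cells$X_um,
  y         = cells$Y_um,
  phenotype = cells$label,
  stringsAsFactors = FALSE
)

# One data frame per sample, named by sample id
my_ldata <- split(std, cells$sample_id)

str(my_ldata, max.level = 1)
```

`split()` names each list element after its sample id, which is the name later passed to the per-sample functions. The package's built-in `toy_ldata` follows the same list-of-data-frames layout.
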
```{r load-data}
library(tlsR)
data(toy_ldata)

# Structure of the built-in example dataset
str(toy_ldata)
table(toy_ldata[["ToySample"]]$phenotype)
```

---

## Step 1: Detect TLS with `detect_TLS()`

`detect_TLS()` identifies B-cell-rich regions with sufficient T-cell
co-localisation using a KNN density approach.

```{r detect-tls}
# Make sure the toy data carries the `phenotype` column that input validation expects
data(toy_ldata)
if (!"phenotype" %in% names(toy_ldata[["ToySample"]])) {
  # Fallback mapping from `coarse_phen_vec`; adjust if your labels live elsewhere
  toy_ldata[["ToySample"]]$phenotype <- toy_ldata[["ToySample"]]$coarse_phen_vec
}

ldata <- detect_TLS(
  LSP                     = "ToySample",
  k                       = 30,  # neighbours for density estimation
  bcell_density_threshold = 15,  # min avg 1/k-distance (um)
  min_B_cells             = 50,  # min B cells per candidate TLS
  min_T_cells_nearby      = 30,  # min T cells within max_distance_T
  max_distance_T          = 50,  # search radius (um)
  ldata                   = toy_ldata
)

table(ldata[["ToySample"]]$tls_id_knn)
```

The new column `tls_id_knn` is `0` for non-TLS cells and a positive integer
for cells assigned to TLS 1, 2, 3, ….

### Quick base-R check plot

```{r base-plot, fig.alt="Scatter plot of ToySample cells coloured by TLS membership"}
df      <- ldata[["ToySample"]]
pal     <- c("#0072B2", "#009E73", "#CC79A7")
tls_ids <- sort(unique(df$tls_id_knn[df$tls_id_knn > 0]))

# Map a TLS id to a palette colour, recycling the palette beyond three TLS.
# Indexing the palette directly with ids that include 0 drops those elements
# and misaligns colours with cells, so the ids are shifted and wrapped instead.
tls_col <- function(id) pal[(id - 1) %% length(pal) + 1]
col     <- ifelse(df$tls_id_knn == 0, "grey80", tls_col(df$tls_id_knn))

plot(df$x, df$y,
     col = col, pch = 19, cex = 0.3,
     xlab = "x (um)", ylab = "y (um)",
     main = "Detected TLS: ToySample")
legend("topright",
       legend = c("Background", if (length(tls_ids)) paste0("TLS ", tls_ids)),
       col    = c("grey80", tls_col(tls_ids)),
       pch = 19, pt.cex = 1.2, bty = "n")
```

---

## Step 2: Local Ripley's L with `scan_clustering()` (Optional)

`scan_clustering()` slides a square window across the tissue and tests for
statistically significant immune cell clustering using Ripley's L with a
Monte Carlo CSR envelope.

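
The idea behind a Monte Carlo CSR test can be sketched in a few lines of base R. This is an illustration under simplifying assumptions (one window, one fixed radius, no edge correction, a one-sided rank test) and not the code that `scan_clustering()` runs; the helper `ripley_L()` and the simulated points are hypothetical.

```{r ripley-sketch}
# Ripley's L at a single radius r, without edge correction (a simplification)
ripley_L <- function(x, y, r, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))
  n_pairs <- sum(d <= r) - n            # ordered pairs within r, minus self-pairs
  K <- area * n_pairs / (n * (n - 1))   # naive estimate of Ripley's K
  sqrt(K / pi)                          # L transform: L(r) is about r under CSR
}

set.seed(42)
ws   <- 500   # window side (um)
r    <- 50    # radius at which clustering is assessed (um)
nsim <- 39    # number of CSR simulations

# Toy pattern: 60 tightly clustered points plus 40 uniform background points
x <- c(rnorm(60, mean = 250, sd = 20), runif(40, 0, ws))
y <- c(rnorm(60, mean = 250, sd = 20), runif(40, 0, ws))

L_obs <- ripley_L(x, y, r, ws^2)

# Null distribution: same number of points placed uniformly in the window
L_sim <- replicate(nsim, {
  ripley_L(runif(length(x), 0, ws), runif(length(x), 0, ws), r, ws^2)
})

# One-sided Monte Carlo p-value for clustering
p_value <- (1 + sum(L_sim >= L_obs)) / (nsim + 1)
c(L_obs = L_obs, max_L_sim = max(L_sim), p_value = p_value)
```

With `nsim = 39` the smallest attainable one-sided p-value is 1/40 = 0.025, and a two-sided pointwise envelope built from the same 39 simulations corresponds to a 5% level. The sketch covers a single window at a single radius; `scan_clustering()` applies the windowed test across the tissue.
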
```{r scan, eval = FALSE}
# eval=FALSE because this step can take ~10–30 s on real data
windows <- scan_clustering(
  ws        = 500,          # window side (um)
  sample    = "ToySample",
  phenotype = "B cells",
  nsim      = 39,           # Monte Carlo simulations (39 → p < 0.05)
  plot      = FALSE,
  ldata     = ldata
)

cat("Significant windows:", length(windows), "\n")

# Access the first window's centre and cell count:
if (length(windows) > 0) {
  cat("Centre:", windows[[1]]$window_center, "\n")
  cat("Cells: ", windows[[1]]$n_cells, "\n")
}
```

---

## Step 3: ICAT Score with `calc_icat()`

The **ICAT (Immune Cell Arrangement Trace)** index quantifies how linearly
organised cells are within a TLS. A higher value indicates a more structured
(germinal-centre-like) arrangement.

```{r icat}
n_tls <- max(ldata[["ToySample"]]$tls_id_knn, na.rm = TRUE)

if (n_tls >= 1) {
  icat_scores <- vapply(
    seq_len(n_tls),
    function(id) calc_icat("ToySample", tlsID = id, ldata = ldata),
    numeric(1)
  )
  names(icat_scores) <- paste0("TLS", seq_len(n_tls))
  print(icat_scores)
}
```

`calc_icat()` returns `NA` (with a message) if a TLS has too few cells or if
FastICA fails to converge; no errors are thrown.

---

## Step 4: Detect T-cell Clusters with `detect_tic()`

T-cell clusters (TIC) that lie *outside* TLS are identified with HDBSCAN. The
`min_pts` and `min_cluster_size` arguments let you control sensitivity.

```{r detect-tic}
ldata <- detect_tic(
  sample           = "ToySample",
  min_pts          = 10,  # HDBSCAN minPts
  min_cluster_size = 10,  # drop clusters smaller than this
  ldata            = ldata
)

table(ldata[["ToySample"]]$tcell_cluster_hdbscan, useNA = "ifany")
```

---

## Step 5: Summary Table with `summarize_TLS()`

`summarize_TLS()` produces a tidy one-row-per-sample summary, convenient for
downstream statistical analysis.

```{r summary}
sumtbl <- summarize_TLS(ldata, calc_icat_scores = FALSE)
print(sumtbl)
```

With `calc_icat_scores = TRUE`, a list-column `icat_scores` is appended
containing named numeric vectors of per-TLS ICAT values.

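
A list-column of named vectors is compact but awkward to model or plot directly. The sketch below shows one way to flatten such a column into a long table with base R. It uses a mock summary table rather than real `summarize_TLS()` output: the `sample` column name is an assumption and the ICAT values are invented placeholders.

```{r icat-long}
# Mock summary table; `sample` is an assumed column name, values are placeholders
mock_sumtbl <- data.frame(sample = c("S1", "S2", "S3"), stringsAsFactors = FALSE)
mock_sumtbl$icat_scores <- list(
  c(TLS1 = 0.42, TLS2 = 0.17),
  c(TLS1 = 0.30),
  numeric(0)                     # a sample with no TLS
)

# One row per (sample, TLS) pair; samples without TLS contribute zero rows
icat_long <- do.call(rbind, lapply(seq_len(nrow(mock_sumtbl)), function(i) {
  s <- mock_sumtbl$icat_scores[[i]]
  data.frame(
    sample = rep(mock_sumtbl$sample[i], length(s)),
    tls    = as.character(names(s)),
    icat   = unname(s),
    stringsAsFactors = FALSE
  )
}))

icat_long
```

The long layout has one ICAT value per row, so it can be passed straight to functions such as `aggregate()` or `boxplot()`.
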

---

## Step 6: Visualise with `plot_TLS()`

`plot_TLS()` produces a ggplot2 scatter plot with TLS and TIC coloured
distinctly using a colourblind-friendly palette.

```{r plot-tls, fig.alt="ggplot2 spatial map of ToySample with TLS and TIC highlighted"}
p <- plot_TLS(
  sample     = "ToySample",
  ldata      = ldata,
  show_tic   = TRUE,
  point_size = 0.5,
  alpha      = 0.7
)
```

The returned `ggplot` object can be further customised with standard ggplot2
functions:

```{r plot-custom, fig.alt="Customised TLS plot with dark theme"}
library(ggplot2)
p + theme_dark() + labs(title = "ToySample: dark theme")
```

---

## Multi-Sample Workflow

`tlsR` is designed to scale naturally to many samples. Pass your full `ldata`
list and iterate over the sample names, threading the updated list through
each call:

```{r multi-sample, eval = FALSE}
samples <- names(ldata)

# Each call returns the updated list, which Reduce() feeds into the next call
ldata <- Reduce(function(ld, s) detect_TLS(s, ldata = ld), samples, ldata)
ldata <- Reduce(function(ld, s) detect_tic(s, ldata = ld), samples, ldata)

summary_all <- summarize_TLS(ldata)
print(summary_all)
```

---

## Session Info

```{r session}
sessionInfo()
```