Title: XOA v3.2 datasets do not require batch correction to be analyzed with previous XOA v3 datasets.
Question: Xenium Onboard Analysis (XOA) v3.2 introduces RNA decoding algorithm changes from XOA v3.0 and XOA v3.1. How do these changes impact transcript detection and what special considerations are needed when combining data run on both versions?
Background:
Changes to transcript decoding algorithms can raise concerns about batch effects when researchers combine datasets run on different XOA versions. Beginning with XOA v2.0, Xenium introduces algorithm changes that significantly impact transcript results only in major versions of XOA. Minor and patch versions may include bug fixes to the decoding algorithm, provided they do not significantly alter transcript results and the effect is limited to a small number of genes.
In XOA v3.2, decoding algorithm changes fixed two issues:
- The lowest raw Q-score bin range could be so small that the bin contained no negative control codewords, causing the transcript in that bin to be calibrated to QV40. No more than one transcript has ever been observed in this bin.
- When two true positive transcripts were detected in the same location, both were incorrectly removed.
In addition, XOA v3.2 now writes false positive transcripts, which were previously removed for colocalizing with another transcript, to transcripts.parquet with a Q-score of 0. This does not affect analyses that use the QV20+ filtered transcripts in the cell feature matrix. For more details, see Overview of Xenium Algorithms.
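To illustrate why a Q-score of 0 keeps these transcripts out of QV20+ analyses, here is a minimal sketch of a threshold filter. It assumes each transcript record exposes a Phred-scaled Q-score under a `qv` key and a target name under `feature_name`; the records themselves are made up for illustration and are not taken from any dataset.

```python
QV_THRESHOLD = 20  # the QV20+ cutoff referenced in the article


def phred_to_error_probability(qv):
    """Convert a Phred-scaled Q-score to an estimated error probability."""
    return 10 ** (-qv / 10)


def filter_high_quality(transcripts, min_qv=QV_THRESHOLD):
    """Keep only transcript records whose Q-score meets the threshold."""
    return [t for t in transcripts if t["qv"] >= min_qv]


# Hypothetical records (assumed field names, invented values).
transcripts = [
    {"feature_name": "GENE_A", "qv": 40.0},
    {"feature_name": "GENE_A", "qv": 0.0},   # e.g. a colocalized false positive written with Q-score 0
    {"feature_name": "GENE_B", "qv": 23.5},
    {"feature_name": "GENE_B", "qv": 12.1},
]

kept = filter_high_quality(transcripts)
print(len(kept), "of", len(transcripts), "records pass QV >=", QV_THRESHOLD)
```

Under the Phred convention, QV20 corresponds to an estimated error probability of 0.01, so any record written with a Q-score of 0 falls below the cutoff and is dropped by the filter.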
Answer:
The changes impact a small number of targets, most of which are negative controls. They do not affect single cell clustering results, and batch correction is not necessary when combining data generated from XOA v3.2, XOA v3.1, and XOA v3.0. Below are two comparisons of datasets that were analyzed with both XOA v3.2 and XOA v3.0. XOA v3.1 is not shown because its decoding algorithm is identical to XOA v3.0.
The table below shows the fraction change in summary metrics for transcript detection from XOA v3.0 to XOA v3.2, based on transcripts with QV >= 20. Values are XOA v3.2 relative to XOA v3.0, so a value of 1 indicates no change.
All sample preparation and software preprocessing were kept identical. The two tested chemistries, tissue types, and panels are as follows:
- Xenium V1 (FFPE Human Ductal Adenocarcinoma): Xenium Human Immuno-Oncology Profiling Panel (380 genes)
- Xenium Prime (FFPE Human Pancreas): Xenium Prime 5K Human Pan Tissue & Pathways Panel (pre-designed panel) + 5 add-on genes = 5,006 genes
Chemistry version | Decoded QV >= 20 transcripts (median transcripts per cell), XOA v3.0 | Decoded QV >= 20 transcripts (median transcripts per cell), XOA v3.2 | Fraction change in transcripts | Fraction median gain for genes | Fraction change in negative control probe counts per control per cell | Fraction change in estimated number of false positive transcripts per cell |
Xenium V1 | 19,006,751 (42) | 19,067,126 (42) | 1.003 | 1 | 1.027 | 1.023 |
Xenium Prime | 57,874,538 (92) | 57,888,790 (92) | 1.000 | 1 | 1.009 | 1 |
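As a quick arithmetic check, the "Fraction change in transcripts" column can be reproduced from the decoded transcript totals in the table by dividing the XOA v3.2 count by the XOA v3.0 count. The sketch below uses only the numbers reported above; the variable names are illustrative.

```python
# Decoded QV >= 20 transcript totals copied from the table above.
totals = {
    "Xenium V1": {"v3.0": 19_006_751, "v3.2": 19_067_126},
    "Xenium Prime": {"v3.0": 57_874_538, "v3.2": 57_888_790},
}

# Ratio of XOA v3.2 to XOA v3.0, rounded to three decimals as in the table.
ratios = {
    chemistry: round(counts["v3.2"] / counts["v3.0"], 3)
    for chemistry, counts in totals.items()
}

for chemistry, ratio in ratios.items():
    print(f"{chemistry}: {ratio:.3f}")
# Xenium V1: 1.003
# Xenium Prime: 1.000
```

Both ratios match the table, which corresponds to a gain of roughly 0.3% in decoded transcripts for Xenium V1 and well under 0.1% for Xenium Prime.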
Figure 1. Decoded QV >= 20 Transcript Counts – Pseudo-Bulk Comparison
Each target with decoded QV >= 20 transcripts is represented by a dot, with XOA v3.0 counts on the x-axis and XOA v3.2 counts on the y-axis. Both axes are log-normalized, and dots are colored by codeword category.
Xenium V1: XOA v3.0 vs XOA v3.2 (left) | Xenium Prime: XOA v3.0 vs XOA v3.2 (right)
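A comparison of this kind can be sketched as follows. The exact normalization used for the figure is not stated, so this example assumes a log10(1 + count) transform, and the per-target counts and target names are hypothetical placeholders rather than values from the datasets.

```python
import math


def paired_log_counts(counts_a, counts_b):
    """Return {target: (log10(1 + a), log10(1 + b))} over the union of targets."""
    targets = sorted(set(counts_a) | set(counts_b))
    return {
        target: (
            math.log10(1 + counts_a.get(target, 0)),
            math.log10(1 + counts_b.get(target, 0)),
        )
        for target in targets
    }


# Hypothetical pseudo-bulk counts (QV >= 20 transcripts summed over all cells).
counts_v30 = {"GENE_A": 99, "GENE_B": 999, "NEG_CONTROL_1": 0}
counts_v32 = {"GENE_A": 99, "GENE_B": 999, "NEG_CONTROL_1": 9}

points = paired_log_counts(counts_v30, counts_v32)
for target, (x, y) in points.items():
    # x is the XOA v3.0 coordinate, y is the XOA v3.2 coordinate.
    print(f"{target}: x={x:.2f}, y={y:.2f}, on_diagonal={math.isclose(x, y)}")
```

Targets whose counts are unchanged between versions fall on the diagonal, so any version-specific shift shows up as dots displaced from it.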
Figure 2. UMAP Projection and Clustering
The changes in the decoding algorithm introduce minor differences in transcript counts. To assess batch effects, we projected PCA-reduced XOA v3.0 and XOA v3.2 data into the same UMAP space and performed clustering using nearest neighbors. Cells with no transcripts were filtered out, and default Seurat parameters were used.
Xenium V1: XOA v3.0 vs XOA v3.2
Xenium Prime: XOA v3.0 vs XOA v3.2