Question: How can I ensure every cell in my transcript assignments has at least one high-quality transcript when importing third party segmentation for Xenium Ranger?
Rationale: When importing transcript-based segmentations using Xenium Ranger v2.0.1, every cell must have at least one high-quality transcript (Q-Score ≥ 20) assigned to it in the transcript assignment CSV. Otherwise, the cell spatial data will be mismatched with the cell gene expression data in the Xenium Ranger outs. Xenium Ranger v1.6 and v1.7 do not have this requirement for imported transcript-based segmentations.
Answer: The patch_transcript_assignments.py Python script demonstrated here can be used on transcript assignment files in Baysor format to remove cells without high-quality transcripts before running Xenium Ranger import-segmentation. Two conditions can make it necessary to clean up the transcript assignment inputs:
- Condition 1: Segmentation results contain one or more cell boundary polygons with no transcripts assigned to them. In principle, transcript-based segmentation methods shouldn’t produce empty cells with zero transcripts, but in practice this can happen depending on the method used (ProSeg, for example, occasionally does so). In this case, the empty cells need to be removed from the GeoJSON before running Xenium Ranger v2.0.1.
- Condition 2: Segmentation results contain one or more cells with no high-quality transcripts (transcripts with Q-Score ≥ 20). This can be avoided by removing low-quality transcripts (i.e., transcripts with Q-Score < 20) and non-gene transcripts (e.g., negative control probes) from the transcripts table before running transcript-based segmentation. This filtering step is also best practice, because it keeps the segmentation algorithm from using inaccurate and non-biological data. If this filtering was not done, you can use the Python script below.
- Step 1: Download the attached patch_transcript_assignments.py script from the bottom of this article.
- Step 2: Install numpy and polars. The script was tested using numpy version 1.23.5 and polars version 1.2.1.
  pip install numpy
  pip install polars
- Step 3: Run patch_transcript_assignments.py on your XOA output bundle, transcript assignment CSV (Baysor v0.6 format), and boundary polygon GeoJSON (Baysor v0.6 format).
  python patch_transcript_assignments.py --xenium-bundle ../xenium_bundle --transcript-assignment ../baysor-transcript-assignments.csv --viz-polygons ../baysor-cell-polygons.geojson --output-transcript-assignment new-transcript-assignments.csv --output-viz-polygons new-cell-polygons.geojson
- Step 4: Use the new transcript assignment CSV and boundary polygon GeoJSON as inputs to Xenium Ranger import-segmentation.
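To make Condition 2 concrete, the following is a minimal sketch of the kind of check involved. It is not the attached script. It assumes a Baysor-style assignment table with transcript_id, cell, and is_noise columns and a lookup of Q-Scores by transcript ID. The sample data, the column names, and the helper names cells_with_high_quality and drop_cells_without_high_quality are all illustrative assumptions. The sketch simply drops the rows of offending cells, and the attached script may handle those transcripts differently.

```python
import csv
import io

Q_SCORE_THRESHOLD = 20  # high-quality cutoff stated in this article

# Hypothetical stand-in for the Xenium transcripts table: transcript_id -> Q-Score.
transcript_qv = {"t1": 35.0, "t2": 12.0, "t3": 8.0, "t4": 40.0, "t5": 15.0, "t6": 30.0}

# Hypothetical stand-in for a Baysor-style assignment CSV (column names are assumed).
assignment_csv = """transcript_id,cell,is_noise
t1,cell-1,false
t2,cell-1,false
t3,cell-2,false
t4,cell-3,false
t5,cell-2,false
t6,,true
"""


def cells_with_high_quality(rows, qv_by_transcript, threshold=Q_SCORE_THRESHOLD):
    """Return the set of cell IDs that own at least one transcript at or above the threshold."""
    keep = set()
    for row in rows:
        cell = row["cell"]
        if not cell:
            continue  # unassigned transcript, belongs to no cell
        if qv_by_transcript.get(row["transcript_id"], 0.0) >= threshold:
            keep.add(cell)
    return keep


def drop_cells_without_high_quality(rows, qv_by_transcript):
    """Keep unassigned rows and rows whose cell has a high-quality transcript."""
    keep = cells_with_high_quality(rows, qv_by_transcript)
    return [row for row in rows if not row["cell"] or row["cell"] in keep]


rows = list(csv.DictReader(io.StringIO(assignment_csv)))
kept_cells = cells_with_high_quality(rows, transcript_qv)
patched_rows = drop_cells_without_high_quality(rows, transcript_qv)

removed_cells = {r["cell"] for r in rows if r["cell"]} - kept_cells
print("cells removed:", sorted(removed_cells))
```

In this toy data, cell-2 only has transcripts below the threshold, so it is the one cell that would be removed. Note that low-quality transcripts of a retained cell (t2 in cell-1) stay assigned, since the requirement is only that each cell has at least one high-quality transcript.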
The patch_transcript_assignments.py script is attached.
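For Condition 1, the idea of removing empty cells from the GeoJSON can be sketched as follows. This is an illustration and not the attached script. It assumes the polygons are stored as a GeometryCollection whose geometries each carry a cell field, and that the cell IDs in the GeoJSON can be compared directly with those in the assignment CSV; both are assumptions about the file layout, so check them against your own files. The helper name prune_empty_polygons and the sample data are hypothetical.

```python
import json

# Hypothetical polygon file content; the layout (GeometryCollection with a
# per-geometry "cell" field) is an assumption about the Baysor-style GeoJSON.
polygons = {
    "type": "GeometryCollection",
    "geometries": [
        {"type": "Polygon", "cell": "cell-1",
         "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
        {"type": "Polygon", "cell": "cell-2",
         "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 2]]]},
        {"type": "Polygon", "cell": "cell-3",
         "coordinates": [[[4, 4], [5, 4], [5, 5], [4, 4]]]},
    ],
}

# Hypothetical set of cell IDs that have at least one transcript in the
# assignment CSV. Here cell-3 has a polygon but no transcripts.
cells_with_transcripts = {"cell-1", "cell-2"}


def prune_empty_polygons(collection, cells_to_keep):
    """Return a copy of the collection without polygons for cells outside cells_to_keep."""
    kept = [g for g in collection["geometries"] if g.get("cell") in cells_to_keep]
    return {**collection, "geometries": kept}


pruned = prune_empty_polygons(polygons, cells_with_transcripts)
remaining_cells = [g["cell"] for g in pruned["geometries"]]
print(json.dumps(remaining_cells))
```

The same function could be applied after the Q-Score check, passing the set of cells that survive it, so that the CSV and the GeoJSON describe the same cells.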