Question: How can I get Xenium probe sequences from the BED file?
Answer: Use bedtools getfasta and the reference to derive the probe target sequences.
The Xenium custom probe design provides a BED file with the target genomic coordinates but not the probe target sequences. The pre-designed base panels provide both at https://www.10xgenomics.com/support/in-situ-gene-expression/documentation/steps/panel-design/pre-designed-xenium-gene-expression-panels. We can derive the probe genomic sequences using the BED file, the 2020-A reference, and the external tool bedtools.
- 2020-A reference download https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest
- bedtools https://bedtools.readthedocs.io/en/latest/content/installation.html
The module that enables extracting genomic sequences using BED coordinates is getfasta. Documentation is at https://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html.
Example command
bedtools getfasta \
-name \
-s \
-split \
-fi ~/Documents/ref/human/refdata-gex-GRCh38-2020-A/fasta/genome.fa \
-bed xenium_human_breast_gene_expression_panel_probe_locations.bed \
> probes.fasta
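Once the command has written probes.fasta, a quick sanity check is to count the records and look at the range of sequence lengths. The sketch below uses only awk. The helper name fasta_summary is made up for this illustration and is not part of bedtools or any 10x Genomics tool.

```shell
# Hypothetical helper (not part of bedtools): print the number of FASTA
# records and the shortest and longest sequence length in a file.
fasta_summary() {
  awk '
    # Fold the length of the record just finished into min/max.
    function flush_record() {
      if (seen == 0 || len < min) min = len
      if (len > max) max = len
      seen = 1
    }
    /^>/ { if (n > 0) flush_record(); n++; len = 0; next }  # header line
    { len += length($0) }                                   # sequence line
    END {
      if (n > 0) flush_record()
      printf "records=%d min_len=%d max_len=%d\n", n, min, max
    }
  ' "$1"
}

# Usage (assumes the example command above has been run):
# fasta_summary probes.fasta
```

The record count would be expected to match the number of feature lines in the BED file, provided every chromosome name in the BED file is present in the reference FASTA.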
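The -name flag includes the BED name field in each FASTA header. The -s flag makes getfasta honor the strand column of the BED file, reverse-complementing features on the minus strand. This preserves the strand-orientation information and produces sequences that are antisense, meaning reverse-complementary to the RNA transcript. The -split flag accounts for gaps in genomic coverage, e.g. as would happen across an intron, by concatenating the blocks of each BED record so that the output matches the probe sequence as it is synthesized.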
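As a rough illustration of what -s does to a minus-strand feature, the reverse complement of a short sequence can be reproduced with standard Unix tools. This is a sketch only: the helper name revcomp is hypothetical and not part of bedtools, it assumes rev and tr are installed, and it maps only A, C, G, and T (other characters such as N pass through unchanged).

```shell
# Hypothetical helper (not part of bedtools): reverse-complement a DNA string.
revcomp() {
  # Reverse the string, then swap each base for its complement.
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

revcomp ACCGT   # prints ACGGT
```

For a feature on the plus strand, getfasta returns the reference sequence as is. The transformation above applies only to features marked with "-" in the strand column.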
Below are the results from using bedtools v2.31.0.
Products: Xenium Analyzer
Last Modified: 6/28/2023