Question: How can I get Xenium probe sequences from the BED file?
Answer: For older standard add-on designs, the instructions allow deriving the target region sequences from the BED file.
The Support site now also provides the genomic target probe sequences (FASTA) alongside the BED file of genomic coordinates for pre-designed base panels, for both Xenium V1 and Xenium Prime. For more recent standard add-on designs, the probe_info.csv.gz file provides the genomic target region sequences. For advanced custom target sequences, see custom_sequences.csv.
The Xenium custom probe design provides a BED file giving the target genomic coordinates. We can derive the probe genomic sequences using the BED file, the 2020-A reference, and the external tool bedtools. Use bedtools getfasta and the reference to derive the probe target sequences.
- 2020-A reference download https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest
- bedtools https://bedtools.readthedocs.io/en/latest/content/installation.html
The module that enables extracting genomic sequences using BED coordinates is getfasta. Documentation is at https://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html.
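Before running getfasta, it can help to confirm how many columns the BED file has: strand-aware extraction (-s) reads the strand from column 6, and block-aware extraction (-split) expects the BED12 block columns. The following is a minimal sketch using standard awk and sort; the helper name bed_column_counts is hypothetical and not part of bedtools or any 10x tool.

```shell
# Summarize how many tab-separated columns each BED record has.
# Header lines (track/browser/#) and blank lines are skipped.
# bed_column_counts is a hypothetical helper name used for illustration.
bed_column_counts() {
  awk -F '\t' '
    /^(track|browser|#)/ || NF == 0 { next }
    { n[NF]++ }
    END { for (k in n) printf "%d columns: %d records\n", k, n[k] }
  ' "$1" | sort -n
}

# Usage, with the BED file name from the example command in this article:
# bed_column_counts xenium_human_breast_gene_expression_panel_probe_locations.bed
```

If every record reports 12 columns, the file carries the block information that -split relies on; records with fewer than 6 columns have no strand column for -s to use.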
Example command
bedtools getfasta \
-name \
-s \
-split \
-fi ~/Documents/ref/human/refdata-gex-GRCh38-2020-A/fasta/genome.fa \
-bed xenium_human_breast_gene_expression_panel_probe_locations.bed \
> probes.fasta
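Once the command finishes, a quick sanity check is to compare the number of BED records with the number of FASTA records written. This is a sketch assuming getfasta writes one sequence per BED record, which is the expected behavior when -split concatenates the blocks of each record; the helper names are hypothetical.

```shell
# Count data records in a BED file, skipping header, comment, and blank lines.
# count_bed_records and count_fasta_records are hypothetical helper names.
count_bed_records() {
  grep -vcE '^(track|browser|#)|^[[:space:]]*$' "$1"
}

# Count sequences in a FASTA file (one ">" header line per sequence).
count_fasta_records() {
  grep -c '^>' "$1"
}

# Usage, with the file names from the example command:
# count_bed_records xenium_human_breast_gene_expression_panel_probe_locations.bed
# count_fasta_records probes.fasta
```

If the two counts differ, check the bedtools warnings on stderr; a common cause is a chromosome name in the BED file that does not appear in the reference FASTA.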
The -s flag preserves the strand-orientation information and produces sequences that are antisense, meaning reverse-complementary to the RNA transcript. The -split flag accounts for gaps in genomic coverage, e.g., as would happen across an intron, such that the probe sequence will be as it is synthesized.
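To make the strand handling concrete: for a feature on the minus strand, -s returns the reverse complement of the forward-strand reference sequence. The same operation can be reproduced on a toy sequence with standard tools. This is only an illustration; the revcomp helper is hypothetical and the input is not a real probe sequence.

```shell
# Reverse-complement a DNA sequence: reverse it, then swap complementary bases.
# revcomp is a hypothetical helper name used for illustration.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Toy input, not a real probe sequence.
revcomp ATGCCG   # prints CGGCAT
```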
The example command was run with bedtools v2.31.0.
Products: Xenium Analyzer
Last Modified: 8/29/2024