Question: How can I build a custom V(D)J reference using the fetch-imgt tool for non-human/mouse experiments?
Answer:
A custom reference for V(D)J can be built by using one of the methods described here.
When building custom references with the fetch-imgt method for species other than human or mouse, customers commonly encounter errors such as 'None of the C-regions are found in the reference'. Such errors arise because C genes are missing from the V(D)J reference for the custom species. One reason for the missing C genes is that the two IMGT databases found on the IMGT website, GENE-DB and LIGM-DB, use incompatible version numbers. The fetch-imgt command (i.e. cellranger-x.y.z/lib/bin/fetch-imgt) uses a query version number that fetches from LIGM-DB (14.1), which may not have C genes for a particular species, while GENE-DB (7.2) may have them for the same species.
ℹ️ This article provides guidance on building a custom reference with the fetch-imgt script, using Rabbit as an example. It also applies to Rhesus monkey, horse, bovine, and potentially other custom species. For any other species, please contact support@10xgenomics.com.
Example: Building a V(D)J reference for Rabbit
To build a reference for Rabbit (Oryctolagus cuniculus), changes need to be made in the fetch-imgt script as illustrated below.
The script uses c_query=14.1 for all species except mouse, for which the version is 7.2. This query version returns no C genes for Rabbit. If query version 7.2 is used in place of 14.1, C genes for Rabbit are included in the reference. Please see the detailed steps below:
Step 1: Change the lines below in the script using any text editor
    if species == "Mus musculus":
        c_query2 = "7.2"
    else:
        c_query2 = "14.1"
to,
    if species == "Mus musculus":
        c_query2 = "7.2"
    elif species == "Oryctolagus cuniculus":
        c_query2 = "7.2"
    else:
        c_query2 = "14.1"
The species string must exactly match the value you will pass to the fetch-imgt --species parameter.
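The Step 1 change amounts to a per-species lookup of the C-REGION query version. The sketch below restates that logic as a standalone function for illustration only. The names `DEFAULT_C_QUERY2`, `C_QUERY2_OVERRIDES` and `c_region_query_version` are hypothetical and do not appear in fetch-imgt, and only the species/version pairs given in this article are included.

```python
# Hypothetical helper, not part of fetch-imgt: it mirrors the
# if/elif/else from Step 1 as a dictionary lookup.
DEFAULT_C_QUERY2 = "14.1"

# Species whose C-REGION query should use version 7.2 (from this article).
C_QUERY2_OVERRIDES = {
    "Mus musculus": "7.2",
    "Oryctolagus cuniculus": "7.2",
}


def c_region_query_version(species: str) -> str:
    """Return the query version for a species name; matching is exact."""
    return C_QUERY2_OVERRIDES.get(species, DEFAULT_C_QUERY2)


for name in ("Mus musculus", "Oryctolagus cuniculus", "oryctolagus cuniculus"):
    print(name, "->", c_region_query_version(name))
```

The lowercase spelling in the last line falls through to the default version, which illustrates why the species string has to match the --species value exactly.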
Step 2: Run the modified fetch-imgt to make the reference as illustrated here.
Step 3: Revert the script to its original state.
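One way to make the revert in Step 3 hard to forget is to back up the script before editing and restore it automatically afterwards. The following is a minimal sketch of that idea, not a 10x Genomics tool. The context manager name `restored_afterwards` and the `.orig` backup suffix are assumptions, and the demonstration runs on a throwaway file standing in for the real script.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def restored_afterwards(script_path):
    """Back up a file, let the caller edit it, then restore the original."""
    script = Path(script_path)
    backup = script.with_name(script.name + ".orig")  # hypothetical backup name
    shutil.copy2(script, backup)  # keep contents and file metadata
    try:
        yield script
    finally:
        backup.replace(script)  # overwrite the edited file with the backup


# Demonstration on a temporary file, not the real fetch-imgt script.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "fetch-imgt"
    demo.write_text('c_query2 = "14.1"\n')
    with restored_afterwards(demo) as script:
        script.write_text('c_query2 = "7.2"\n')  # stand-in for the Step 1 edit
        # ... run the modified script here (Step 2) ...
    restored_text = demo.read_text()

print(restored_text)
```

Because the restore happens in a `finally` clause, the original file comes back even if the run in Step 2 fails.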
Optional verification:
One way to check whether you need to change the database version is to examine the number of hits at the web URLs printed to stdout when running fetch-imgt. For example, for Rabbit (Oryctolagus cuniculus), after executing the default fetch-imgt script, we see the following line in stdout related to IGKC.
Downloading http://www.imgt.org/genedb/GENElect?query=14.1+IGKC&species=Oryctolagus_cuniculus to Oryctolagus_cuniculus_14.1_IGKC.html
Checking http://www.imgt.org/genedb/GENElect?query=14.1+IGKC&species=Oryctolagus_cuniculus shows 'Number of results = 0'.
After changing the C-REGION query version to 7.2 (i.e. adding the "Oryctolagus cuniculus" branch as in Step 1 above), the corresponding URL in the line below shows 'Number of results = 13'.
Downloading http://www.imgt.org/genedb/GENElect?query=7.2+IGKC&species=Oryctolagus+cuniculus to Oryctolaguscuniculus_7.2_IGKC.html
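The comparison above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not part of fetch-imgt: the helper names `genelect_url` and `count_results` are hypothetical, the URL layout is copied from the second log line above, and the parser assumes the page contains the phrase 'Number of results = N' quoted in this article, possibly wrapped in HTML tags.

```python
import re
from urllib.parse import urlencode

# Base URL as it appears in the fetch-imgt log lines above.
GENELECT_URL = "http://www.imgt.org/genedb/GENElect"


def genelect_url(version: str, gene: str, species: str) -> str:
    """Build a GENElect query URL; spaces are encoded as '+'."""
    params = {"query": f"{version} {gene}", "species": species}
    return f"{GENELECT_URL}?{urlencode(params)}"


def count_results(page_text: str):
    """Return N from 'Number of results = N', or None if the phrase is absent."""
    text = re.sub(r"<[^>]+>", " ", page_text)  # drop any HTML tags
    match = re.search(r"Number of results\s*=\s*(\d+)", text)
    return int(match.group(1)) if match else None


# URLs to compare for the two query versions discussed in this article.
for version in ("14.1", "7.2"):
    print(genelect_url(version, "IGKC", "Oryctolagus cuniculus"))

# Parsing demonstrated on a made-up snippet, not on real IMGT output.
sample = "<p>Number of results = <b>13</b></p>"
print(count_results(sample))
```

To run the check against the live site, the page text could be downloaded with `urllib.request.urlopen(url).read().decode(errors="replace")` and passed to `count_results`. The real page markup may differ from the sample here, in which case the pattern would need adjusting.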
Note: The aforementioned version numbers are from IMGT and are based on this page here. The IMGT APIs vary by version and/or species, so reference sequences have to be fetched from different APIs. This is why the version numbers differ between gene groups. Depending on the dataset, i.e. TR or IG, you can also adjust the version numbers in the GENE-DB queries section of the script (see the image below) for your species of interest and check whether C-regions are returned for your dataset.
(from fetch-imgt code, Cell Ranger v7.1.0)
Disclaimer: Modifications to the Cell Ranger code are not officially supported. This code modification is provided as-is for instructional purposes only. 10x Genomics does not support or guarantee the code.
Note: The Cloud platform does not support running the fetch-imgt command. Users will need to use Cell Ranger in command line mode to execute fetch-imgt.
Related KB article: How to create a Cell Ranger compatible V(D)J reference?
Products: Single Cell Immune Profiling
Last Updated: Aug 2023