Question: What are the differences in the pre-built 2020A and 2024A mouse references? Which one should I use?
Answer: The 2024A mouse reference (GRCm39) has an updated genome FASTA and updated annotations (GENCODE vM33/Ensembl 110). The table below lists the changes. You can download the comprehensive list of updated gene IDs and gene names using the link here. We recommend using the latest reference build, GRCm39, unless your project includes experiments that were already analyzed with the older reference build.
When analyzing the same data in Cell Ranger with the two references, we observed differences in the number of cells returned, the mapping metrics, and the clustering results. Below are screenshots comparing web summaries and Loupe files from the same 5' Mouse Splenocytes dataset processed with each reference (left: mm2024A, right: mm2020A). Improved and corrected annotations can change sensitivity and how cells are grouped.
These differences can be attributed to a few factors:
Changes to Gene Annotations and IDs: The new mouse reference includes updates such as new gene IDs, renamed genes, and the removal of outdated ones. These changes can affect how reads are assigned to genes and how genes are classified, which alters the expression profiles and therefore the clustering results. For example, the 2020A mouse reference includes the locus Gm42418. We excluded rRNA from our reference; however, this gene appears to have a biotype conflict: it is also annotated as a lncRNA, so it escaped filtering.
Alignment and Mapping: Changes in gene references can affect how reads are mapped to the genome, and the algorithm is somewhat sensitive to these changes. This can lead to variations in the reported cells, the mapping metrics, and therefore the secondary analysis results.
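To see which annotation changes affect a particular dataset, one option is to compare the gene tables from two runs of the same sample, one per reference. The following is a minimal Python sketch, not an official 10x Genomics tool. It assumes the Cell Ranger feature table layout (tab-separated, gene ID in the first column and gene name in the second); the function names `read_features` and `compare_gene_tables` are hypothetical and introduced here for illustration only.

```python
import csv
import gzip


def read_features(path):
    """Return {gene_id: gene_name} from a features.tsv or features.tsv.gz file.

    Assumes a tab-separated file with the gene ID in column 1 and the
    gene name in column 2. Rows with fewer than two columns are skipped.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        return {row[0]: row[1] for row in reader if len(row) >= 2}


def compare_gene_tables(old, new):
    """Compare two {gene_id: gene_name} mappings.

    Returns gene IDs present only in the old table, gene IDs present only
    in the new table, and (gene_id, old_name, new_name) tuples for IDs
    whose name changed. IDs are compared exactly as written in the files.
    """
    old_ids, new_ids = set(old), set(new)
    removed = sorted(old_ids - new_ids)
    added = sorted(new_ids - old_ids)
    renamed = sorted(
        (gid, old[gid], new[gid])
        for gid in old_ids & new_ids
        if old[gid] != new[gid]
    )
    return {"removed": removed, "added": added, "renamed": renamed}
```

For example, calling `compare_gene_tables(read_features(path_2020A), read_features(path_2024A))`, where both paths are placeholders for the `features.tsv.gz` files of your own runs, returns the three lists. Checking whether a locus of interest, such as Gm42418, appears under "removed" or "renamed" can help explain a shift in clustering between the two references.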
In summary, differences in mapping metrics and clustering outcomes are expected when using different references. It is important to carefully review the annotations to ensure that no cell types are missed and that the results accurately reflect the cell types present in your dataset.
If a specific locus or cell type is a concern in your analysis, please bring it to our attention by contacting support@10xgenomics.com.
Products: Universal 3' Gene Expression, Universal 5' Gene Expression
Last Updated: Jan 2025