Question: What are the possible sources for chains that are annotated as "Multi" in the contig_annotations.csv files?
Answer: The chain type for contigs in filtered_contig_annotations.csv (or all_contig_annotations.csv) can sometimes be annotated as Multi. For example, see contig_3 and contig_4 in the records below:
contig_id | high_confidence | length | chain | v_gene | d_gene | j_gene | c_gene |
AAACCTGCACACTGCG-1_contig_1 | TRUE | 409 | IGK | IGKV1D-39 | None | None | IGKC |
AAACCTGCACACTGCG-1_contig_2 | TRUE | 652 | IGL | IGLV3-10 | None | IGLJ3 | IGLC2 |
AAACCTGCACACTGCG-1_contig_3 | FALSE | 715 | Multi | IGLV10-54 | None | TRAJ12 | None |
AAACCTGCACACTGCG-1_contig_4 | TRUE | 383 | Multi | TRBV14 | None | None | IGHE |
AAACCTGCACACTGCG-1_contig_5 | TRUE | 652 | IGH | IGHV4-4 | IGHD2-15 | IGHJ4 | IGHG1 |
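This annotation indicates that the V, D, J, and C gene segments assigned to that contig come from different chain types. In contig_3, for instance, an IGL V gene (IGLV10-54) is paired with a TRA J gene (TRAJ12). Mis-annotations by the algorithm and PCR chimeras can both lead to a contig being labeled as Multi.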
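To see which chain types are mixed in each Multi contig, the annotation file can be scanned with a few lines of Python. The following is only a minimal sketch: it assumes the CSV has the column names shown in the table above, that missing gene calls appear as "None" or as an empty field, and that the first three letters of a gene name (IGH, IGK, IGL, TRA, TRB, ...) identify its chain type. That prefix rule is a heuristic introduced here for illustration, not how the annotation algorithm itself assigns chains, and the helper names `chain_prefixes` and `multi_contigs` are hypothetical.

```python
import csv
import io

# Gene-call columns as they appear in the contig annotation table above.
GENE_COLUMNS = ("v_gene", "d_gene", "j_gene", "c_gene")


def chain_prefixes(row):
    """Collect the 3-letter prefixes (e.g. IGH, TRA) of a contig's gene calls.

    Assumption: the gene-name prefix is a reasonable proxy for chain type.
    """
    prefixes = set()
    for col in GENE_COLUMNS:
        gene = (row.get(col) or "").strip()
        # Skip missing calls, written as "None" or left empty.
        if gene and gene != "None":
            prefixes.add(gene[:3])
    return prefixes


def multi_contigs(handle):
    """Map each contig annotated as Multi to its sorted list of chain prefixes."""
    reader = csv.DictReader(handle)
    return {
        row["contig_id"]: sorted(chain_prefixes(row))
        for row in reader
        if row["chain"] == "Multi"
    }


# The example records from the table above, written as CSV text.
example = """contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene
AAACCTGCACACTGCG-1_contig_1,TRUE,409,IGK,IGKV1D-39,None,None,IGKC
AAACCTGCACACTGCG-1_contig_2,TRUE,652,IGL,IGLV3-10,None,IGLJ3,IGLC2
AAACCTGCACACTGCG-1_contig_3,FALSE,715,Multi,IGLV10-54,None,TRAJ12,None
AAACCTGCACACTGCG-1_contig_4,TRUE,383,Multi,TRBV14,None,None,IGHE
AAACCTGCACACTGCG-1_contig_5,TRUE,652,IGH,IGHV4-4,IGHD2-15,IGHJ4,IGHG1
"""

for contig_id, chains in multi_contigs(io.StringIO(example)).items():
    print(contig_id, chains)
```

For the example records, this reports contig_3 as mixing IGL and TRA gene calls and contig_4 as mixing IGH and TRB. To run it on a real file, pass an open file handle (for example `open("filtered_contig_annotations.csv", newline="")`) to `multi_contigs` instead of the `io.StringIO` wrapper.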