Question: Do you check that files are uploaded successfully and without corruption? One of my FASTQ file sets is large, around 50 GB, and might take a while to upload via a browser. I am wondering what happens if the upload gets interrupted.
Answer: When you upload, an md5sum check runs locally before the upload and in the cloud after it. You may also want to check the md5 hash of the file yourself; see the article with instructions for Windows/Mac or Linux. The md5sum of the file in the 10x cloud is guaranteed to match the local file after you upload. If the checksum doesn’t match, you will receive an error that looks something like this:
IO error in FASTQ file '/mnt/deck-io/inputs/a79zz2c6-55da-4g64-bb06-0884399da76d/862b2ea6-3990-4605-a05z-28c0251v6e11/Sample_S1_L001_R2_001.fastq.gz', line: 516321294: corrupt gzip stream does not have a matching checksum
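The message above refers to the checksum stored inside the gzip stream itself. If you want to rule out a file that was already damaged before upload, a local check along the following lines may help. This is only a minimal sketch in Python and is not part of any 10x tooling; the function name `gzip_is_intact` is hypothetical, and the check simply decompresses the whole file so that Python's `gzip` module verifies the stream's trailing CRC.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses to the end
    without errors, which includes passing its built-in CRC check."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks so that a large
            # FASTQ file never has to fit in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches, EOFError covers
        # truncated files, zlib.error covers corrupt compressed data.
        return False
    return True
```

Calling `gzip_is_intact("Sample_S1_L001_R2_001.fastq.gz")` on a local copy should return `True` for a healthy file. Expect the check to take a while on very large files, since it reads the entire file.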
For large files or many files, we recommend using the 10x Cloud CLI tool instead of the web browser uploader, since it will attempt to resume the upload or download after a failure such as a dropped internet connection.
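If you would like to record the md5 hash of a large file yourself before uploading, one possible approach is sketched below in Python. It is an illustration rather than the method the uploader uses internally; the function name `md5_of_file` and the 8 MB chunk size are assumptions chosen for the example. The file is hashed in chunks so that a 50 GB FASTQ does not need to be loaded into memory.

```python
import hashlib


def md5_of_file(path, chunk_size=8 * 1024 * 1024):
    """Return the hexadecimal md5 digest of the file at `path`,
    reading it in fixed-size chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it
        # returns an empty bytes object at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string should be the same value that `md5sum` on Linux, `md5` on Mac, or `certutil -hashfile <file> MD5` on Windows reports for the same file.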