What data will I receive for Illumina sequencing? Demultiplexing, Trimming, Filtering

By default, you will receive gzip-compressed FASTQ data as individual files for each sample (demultiplexed). Demultiplexing is included in the service if you provide the barcode sequences on the submission form.
The files will be available for download from our secure SLIMS server.
You will receive only the reads from clusters passing the Illumina quality filter, also called the Illumina chastity filter (please see detailed info below). You will find older recommendations on the internet to also analyze reads from clusters that do not pass the chastity filter. These recommendations are outdated. The filtering has been very reliable for several years, and it is practically impossible to find any usable data in the reads that have been filtered out.
Otherwise, the data will be complete. By default, we do not trim the sequencing data. We recommend carrying out any quality or adapter trimming with third-party tools, since they provide better results than the Illumina tools and offer multiple processing options. SRA submissions also require full-length data.
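
As a quick sanity check after download, the delivered files can be inspected directly in Python without decompressing them first. The following is only a minimal sketch: the function names read_fastq and summarize are illustrative and not part of any delivered tooling, and the code assumes the common four-lines-per-record FASTQ layout.

```python
import gzip
from typing import Iterator, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a gzip-compressed FASTQ file.

    Assumes four lines per record (header, sequence, '+', quality).
    """
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file (or a trailing empty line)
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line, ignored
            quality = handle.readline().rstrip("\n")
            if not header.startswith("@") or len(sequence) != len(quality):
                raise ValueError(f"Malformed FASTQ record near {header!r}")
            yield header[1:], sequence, quality


def summarize(path: str) -> Tuple[int, int]:
    """Return (number of reads, total number of bases) for one file."""
    n_reads = 0
    n_bases = 0
    for _, sequence, _ in read_fastq(path):
        n_reads += 1
        n_bases += len(sequence)
    return n_reads, n_bases
```

Calling summarize on a downloaded file (for example a hypothetical sample_R1.fastq.gz) returns the read count and the total base count, which can be compared against the numbers expected for the run.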

Please note that the sequence data can contain traces of the Illumina PhiX internal standard. For applications such as genome assembly, these PhiX reads should be removed. BBDuk is free software that can do this; please see the Kmer-filtering paragraph in the BBDuk help. Please also see the preprocessing section in this presentation: https://ucdavis-bioinformatics-training.github.io/2017-June-RNA-Seq-Workshop/tuesday/Preprocessing.pdf

The Illumina Chastity Filter:
The Illumina chastity filter is applied only to the first 25 bases of the forward read of each cluster. For each base call, a chastity value is calculated from the fluorescence intensities: it is defined as the ratio of the brightest base intensity to the sum of the brightest and second-brightest base intensities. A cluster passes the filter if no more than one base call has a chastity value below 0.6 in the first 25 cycles.
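
The rule above can be written out as a short Python sketch. This is only an illustration of the definition, not Illumina's implementation: the function names and the per-cycle intensity lists are hypothetical, and the actual filtering happens inside the instrument software on intensities that are not part of the delivered FASTQ files.

```python
from typing import Sequence

CHASTITY_THRESHOLD = 0.6   # minimum chastity value for a "clean" base call
FILTER_CYCLES = 25         # only the first 25 cycles are evaluated
MAX_FAILED_CYCLES = 1      # at most one base call may fall below the threshold


def chastity(intensities: Sequence[float]) -> float:
    """Brightest intensity divided by the sum of the two brightest intensities."""
    top, second = sorted(intensities, reverse=True)[:2]
    total = top + second
    if total <= 0:
        return 0.0  # no usable signal in this cycle (assumption of this sketch)
    return top / total


def passes_filter(cluster: Sequence[Sequence[float]]) -> bool:
    """Apply the chastity rule to one cluster.

    `cluster` holds one list of four base intensities (A, C, G, T) per cycle.
    """
    first_cycles = cluster[:FILTER_CYCLES]
    failed = sum(1 for cycle in first_cycles if chastity(cycle) < CHASTITY_THRESHOLD)
    return failed <= MAX_FAILED_CYCLES
```

For example, a cycle with intensities of 100, 50, 10 and 5 has a chastity of 100 / (100 + 50), about 0.67, and counts as clean, whereas a cycle with two equally bright signals has a chastity of 0.5 and counts against the cluster.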

Please see this FAQ:
When should I trim my Illumina reads and how should I do it?


Category: 06 Sequencing Data
