When should I trim my Illumina reads and how should I do it?

Should I trim adapters from my Illumina reads?
This depends on the objective of your experiments.
If you are sequencing for counting applications such as differential gene expression (DGE) RNA-seq, ChIP-seq, or ATAC-seq, read trimming is generally no longer required, provided that you use a modern aligner. Such studies should use local aligners or pseudo-aligners. Modern “local aligners” such as STAR, BWA-MEM, and HISAT2 will “soft-clip” non-matching sequences. Pseudo-aligners such as Kallisto or Salmon also have no problem with reads containing adapter sequences.
However, if the data are used for variant analysis, genome annotation, or genome or transcriptome assembly, we recommend read trimming, including both adapter and quality trimming.

How should I adapter trim my Illumina reads?
Paired-end sequencing data should be trimmed with algorithms that exploit the paired-end nature of the reads, which enables the most precise trimming. This approach does not require any knowledge of the adapter sequences.
Recommended tools include, for example, BBDuk, Skewer, HTStream, and fastp in their dedicated paired-end modes. Among these, Skewer is likely the easiest to use.
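
The idea behind adapter-free paired-end trimming can be illustrated with a minimal Python sketch. It is not the algorithm of any of the tools named above. It assumes error-free reads of equal length and uses hypothetical function names. When the sequenced fragment is shorter than the read length, read 1 and the reverse complement of read 2 share the fragment sequence, and everything beyond it must be adapter.

```python
# Illustrative sketch only: real trimmers tolerate mismatches and use base qualities.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def infer_insert_length(read1, read2, min_overlap=10):
    """Return the fragment length if the pair read through into adapter, else None.

    For a fragment shorter than the read length, the start of read 1 equals
    the end of the reverse-complemented read 2. Candidates are tried from
    longest to shortest; min_overlap guards against short chance matches.
    """
    rc2 = reverse_complement(read2)
    longest = min(len(read1), len(rc2))
    for insert_len in range(longest, min_overlap - 1, -1):
        if read1[:insert_len] == rc2[len(rc2) - insert_len:]:
            return insert_len
    return None


def trim_pair(read1, read2, min_overlap=10):
    """Clip both mates to the inferred fragment length; leave them as-is otherwise."""
    insert_len = infer_insert_length(read1, read2, min_overlap)
    if insert_len is None:
        return read1, read2
    return read1[:insert_len], read2[:insert_len]
```

Note that no adapter sequence is passed to any of these functions: the overlap between the two mates alone determines where the fragment ends.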
Trimming of single-end sequencing data requires knowledge of the adapter sequences (please see below). Recommended tools are Scythe, Cutadapt, Trimmomatic, HTStream, and BBDuk in their single-end modes.

DNA and RNA sequencing:
TruSeq forward read: AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
TruSeq reverse read: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

DNA sequencing:
Nextera: CTGTCTCTTATACACATCT

For small RNA/miRNA sequencing data, please use this sequence, but also see this FAQ: How should the miRNA/smallRNA data be trimmed?
TruSeq Small RNA: TGGAATTCTCGGGTGCCAAGG
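
To show how a known adapter sequence is used for single-end trimming, here is a minimal Python sketch. It is a simplification and not the algorithm of any of the tools named above. The function names, the minimum overlap of 5 bases, and the 10% mismatch allowance are assumptions chosen for illustration. The read is cut at the leftmost position where the adapter, or a prefix of it that runs off the 3' end of the read, matches.

```python
# TruSeq forward-read adapter as listed above.
TRUSEQ_FORWARD = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"


def count_mismatches(a, b):
    """Count mismatching positions over the length of the shorter string."""
    return sum(x != y for x, y in zip(a, b))


def trim_3prime_adapter(read, adapter, min_overlap=5, max_error_rate=0.1):
    """Return the read with a 3' adapter (full or partial) removed.

    min_overlap and max_error_rate are illustrative assumptions. A short
    overlap can match by chance, so real tools weigh this more carefully.
    """
    for start in range(len(read) - min_overlap + 1):
        # Near the read end the window is shorter than the adapter, so only
        # the corresponding adapter prefix is compared.
        window = read[start:start + len(adapter)]
        allowed = int(len(window) * max_error_rate)
        if count_mismatches(window, adapter) <= allowed:
            return read[:start]
    return read
```

For example, trim_3prime_adapter(read, TRUSEQ_FORWARD) would be applied to each read of a TruSeq single-end library. Indels and base qualities are ignored in this sketch.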

Please see also this page from Illumina: What sequences do I use for adapter trimming?

Quality trimming:
For counting applications, the same considerations apply to quality trimming as to adapter trimming (above): it can be omitted if the right aligners are used.
For other applications, we recommend combining gentle quality trimming at a threshold quality score of Q15 with a read length filter that retains only reads longer than 35 bp.
Quality trimming tools include, for example, Sickle, Trimmomatic, HTStream, and BBDuk.
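
The combination of a Q15 threshold and a 35 bp length filter can be sketched in Python as follows. This is a simplified illustration, not the method of any tool listed above, which typically use sliding windows or running sums. It assumes Phred+33 encoded quality strings and trims only from the 3' end. The function names are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def quality_trim(seq, qual, min_quality=15, length_cutoff=35):
    """Trim low-quality bases from the 3' end, then apply the length filter.

    Returns (trimmed_seq, trimmed_qual), or None if the trimmed read is not
    longer than length_cutoff and should be discarded.
    """
    scores = phred_scores(qual)
    end = len(seq)
    # Walk back from the 3' end while the last kept base is below the threshold.
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    if end <= length_cutoff:
        return None
    return seq[:end], qual[:end]
```

In a paired-end data set, a read that fails the length filter is usually discarded together with its mate so that the two files stay in sync. That bookkeeping is left out of this sketch.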

References:
Williams et al. 2016. Trimming of sequence reads alters RNA-Seq gene expression estimates. BMC Bioinformatics 17:103. Published 2016 Feb 25. doi:10.1186/s12859-016-0956-2
