Should I trim adapters from my Illumina reads?
This depends on the objective of your experiments.
If you are sequencing for counting applications such as differential gene expression (DGE) RNA-seq analysis, ChIP-seq, or ATAC-seq, read trimming is generally no longer required when using modern aligners. For such studies, local aligners or pseudo-aligners should be used. Modern “local aligners” such as STAR, BWA-MEM, and HISAT2 will “soft-clip” non-matching sequences. Pseudo-aligners such as Kallisto or Salmon also have no problem with reads containing adapter sequences.
However, if the data are used for variant analysis, genome annotation, or genome or transcriptome assembly, we recommend read trimming, including both adapter and quality trimming.
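To make the soft-clipping mentioned above concrete: in SAM/BAM output, soft-clipped bases appear as `S` operations at the ends of the CIGAR string. The following is a minimal Python sketch, not part of any aligner, that counts them; the function name `soft_clipped_bases` is hypothetical and chosen for illustration only.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_bases(cigar):
    """Return (clipped_at_start, clipped_at_end) for a CIGAR string.

    "Start" and "end" refer to the read as stored in the SAM record.
    Hard clips (H) are ignored because those bases are absent from the record.
    """
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return (0, 0)
    start = ops[0][0] if ops[0][1] == "S" else 0
    end = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (start, end)
```

A read whose last 30 bases are adapter would typically show up with a CIGAR such as `70M30S`, for which this sketch returns `(0, 30)`.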
How should I adapter trim my Illumina reads?
Paired-end-read sequencing data should be trimmed using algorithms that make use of the paired-end nature to enable the most precise trimming. This mode will not require any knowledge of the adapter sequences.
Recommended tools include, for example, Skewer and BBduk in their dedicated paired-end modes. Among these, Skewer is likely the easiest to use.
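The reason paired-end trimming needs no adapter sequence can be illustrated with a short Python sketch. When the sequenced insert is shorter than the read length, read 1 and the reverse complement of read 2 cover the same insert, and everything beyond it must be adapter. This is a simplified illustration assuming exact matches only; the function names and the `min_insert` parameter are hypothetical, and the recommended tools use more robust, error-tolerant algorithms.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def trim_pair_by_overlap(read1, read2, min_insert=10):
    """Trim adapter read-through from a read pair without knowing the adapter.

    If the first n bases of read 1 equal the reverse complement of the first
    n bases of read 2, the insert is assumed to be n bases long and both
    reads are cut to that length. Exact matching only (an assumption made
    for brevity; real data contain sequencing errors).
    """
    max_len = min(len(read1), len(read2))
    # Try the longest candidate insert first to avoid short chance matches.
    for insert_len in range(max_len, min_insert - 1, -1):
        if read1[:insert_len] == reverse_complement(read2[:insert_len]):
            return read1[:insert_len], read2[:insert_len]
    # No full overlap found: the insert is longer than the reads, nothing to trim.
    return read1, read2
```

For a 20 bp insert sequenced with 30 bp reads, both reads would be cut back to 20 bp regardless of which adapter follows the insert.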
Trimming of single-end sequencing data requires knowledge of the adapter sequences (please see below). Recommended tools are Scythe, Trimmomatic, and BBduk in their single-end modes.
DNA and RNA sequencing:
TruSeq forward read: AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
TruSeq reverse read: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
For small RNA/miRNA sequencing data, please use this sequence, but also see this FAQ: How should the miRNA/smallRNA data be trimmed?
TruSeq Small RNA: TGGAATTCTCGGGTGCCAAGG
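As an illustration of how a known adapter sequence is used for single-end trimming, here is a minimal Python sketch using the TruSeq forward-read adapter listed above. It assumes exact matches only and a hypothetical minimum overlap of 5 bases; the function name `trim_adapter` is made up for this example, and the recommended tools handle mismatches and base qualities far more carefully.

```python
TRUSEQ_FORWARD = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"

def trim_adapter(read, adapter=TRUSEQ_FORWARD, min_overlap=5):
    """Remove a 3' adapter from a read (exact matching only)."""
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a partial adapter (a prefix of the adapter).
    # Check the longest possible overlap first.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    # No adapter detected: return the read unchanged.
    return read
```

The minimum overlap is a trade-off: a smaller value removes more partial adapters but also clips more genuine sequence that matches the adapter start by chance.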
For counting applications, the same considerations as for adapter trimming (above) apply to quality trimming: it can be omitted when using the right aligners.
For other applications, we recommend combining gentle quality trimming (threshold quality score of Q15) with a read-length filter that retains only reads longer than 35 bp.
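The recommendation above can be sketched in Python as follows. This is a deliberately simple illustration that trims low-quality bases from the 3' end only and assumes Phred+33 quality encoding; the function names are hypothetical, and dedicated trimming tools typically use sliding-window or running-sum approaches instead.

```python
def quality_trim_3prime(seq, qual, min_q=15, phred_offset=33):
    """Cut bases off the 3' end until the last base has quality >= min_q.

    `qual` is the FASTQ quality string, assumed to be Phred+33 encoded.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - phred_offset < min_q:
        end -= 1
    return seq[:end], qual[:end]

def passes_length_filter(seq, longer_than=35):
    """Keep only reads longer than `longer_than` bases after trimming."""
    return len(seq) > longer_than

def trim_and_filter(seq, qual, min_q=15, longer_than=35):
    """Return the trimmed (seq, qual) pair, or None if the read is too short."""
    trimmed_seq, trimmed_qual = quality_trim_3prime(seq, qual, min_q)
    if passes_length_filter(trimmed_seq, longer_than):
        return trimmed_seq, trimmed_qual
    return None
```

With these defaults, a 40 bp read whose last three bases fall below Q15 is kept at 37 bp, while a read trimmed down to 30 bp is discarded.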
Williams et al. 2016. Trimming of sequence reads alters RNA-Seq gene expression estimates. BMC Bioinformatics. 2016;17:103. Published 2016 Feb 25. doi:10.1186/s12859-016-0956-2