3′ Tag-Seq is a protocol for generating low-cost and exceptionally low-noise gene expression profiling data. The protocol is also known as TagSeq, 3′ Tag-RNA-Seq, Digital RNA-seq, Quant-Seq 3′ mRNA-Seq, etc. (note that many of these names have previously been used for a variety of other protocols). In contrast to traditional RNA-seq (available here), which generates sequencing libraries for the whole transcript, 3′ Tag-Seq generates only a single initial library molecule per transcript, complementary to 3′ end sequences. For human samples, for example, restricting the sequencing to this small part of each transcript reduces the number of reads required at least five-fold. In most cases, more than 48 samples can be sequenced per HiSeq 4000 lane.
The vast majority of RNA-seq studies are analyzed exclusively for differential gene expression (DGE). Conventional full-transcript RNA-seq protocols generate more data than this purpose requires. The complexity of standard RNA-seq data is not an advantage if the aim of the project is only DGE analysis; 3′ Tag-Seq may actually be the superior tool for this application.
In our experience, 3′ Tag-Seq data have so far shown exceptionally low noise as well as insensitivity to variations in RNA sample quality. We have successfully applied 3′ Tag-Seq to a wide range of organisms, e.g. yeast, amoeba, insects, flatworms, plants, fish, and mammals. Thus far we have encountered problems with only one particular yeast strain.
- We offer 3′ Tag-Seq as full-service packages (Batch-Tag-Seq, please see below) including library preps, sequencing, and optional DGE data analysis. These packages are priced at easy-to-budget per-sample recharge rates.
- For demanding sample types that require protocol modifications we are offering custom 3′ Tag-Seq sequencing – please inquire.
- A dedicated protocol is available for lowest input samples – please see below.
Advantages of 3′ Tag-Seq:
- low noise gene expression profiling
- less sensitive to RNA sample quality/integrity variations (compared to poly-A enrichment protocols)
- >99% strand-specific; same direction as mRNA transcripts
- requires significantly lower numbers of sequencing reads
- single read sequencing is sufficient
- simpler library prep protocol
- costs about half or less compared to standard RNA-seq
- costs lower than, or comparable to, microarray analysis
- much higher dynamic range compared to microarrays
- we routinely sequence 48 libraries per HiSeq lane; for some applications 96 libraries per lane are possible
- by default UMIs (unique molecular identifiers) will be incorporated: especially helpful for very low input or high depth sequencing – please see below
- Batch-Tag-Seq packages: simple pricing scheme and simplified planning of experiments – please see below
Disadvantages of 3′ Tag-Seq:
- data do not contain any transcript-splicing information
- data analysis requires a reference genome with good annotation (also of the UTRs)
- only applicable to eukaryotic samples
- protocol is a bit more sensitive to chemical contaminants (spin column cleaned RNA samples are recommended)
For high-throughput 3′ Tag-Seq library generation we request 15 ul of pure total RNA sample at a concentration of 50-100 ng/ul. For custom 3′ Tag-Seq library preps the input amounts can be as low as 10 ng total. The RNA samples for this protocol need to be isolated or cleaned up by spin-column protocols. Please also see the sample requirements page. 3′ Tag-Seq libraries are sequenced by single-end sequencing on the HiSeq 4000 or the NextSeq. Please note that 3′ Tag-Seq libraries generate lower read numbers on the HiSeq 4000 (about 320 million reads per lane) compared to standard RNA-seq libraries. Since the DGE analysis of 3′ Tag-Seq data requires much lower read numbers, this is usually not a problem. Please note that for some analysis pipelines, trimming the first 12 bases from the reads is recommended. We provide the full-length data. Trimming is not necessary if you are using a local aligner (like STAR or BBMap). The sequences can be trimmed easily, for example, with the “reformat” command from BBTools. Please also see the 3′ Tag-Seq data analysis recommendations and this note on working with degraded RNA samples.
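As a rough planning aid, the lane yield quoted above can be converted into an average read count per library. The following Python sketch assumes that reads are distributed evenly across all libraries on a lane, which real pools only approximate; the function name is illustrative, and the 48 and 96 libraries-per-lane values are the multiplexing levels mentioned elsewhere on this page.

```python
def reads_per_sample(lane_reads: int, n_samples: int) -> float:
    """Average reads per library, assuming the lane is shared evenly."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return lane_reads / n_samples


# Approximate HiSeq 4000 yield for 3' Tag-Seq libraries, as quoted in the text.
LANE_READS = 320_000_000

for n in (48, 96):
    millions = reads_per_sample(LANE_READS, n) / 1e6
    print(f"{n} libraries per lane: ~{millions:.1f} million reads each")
```

Under the even-split assumption this gives about 6.7 million reads per library at 48 libraries per lane and about 3.3 million at 96.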
By default the 3′ Tag-Seq data will include UMIs: the first 6 bases of the forward read represent the UMI, followed by a common linker with the sequence “TATA”, followed by the 12 bp random priming sequence. We recommend transferring the UMI sequence to the read header and trimming the first 22 bases (6 + 4 + 12) from each read with UMI-tools or custom scripts. The same software can be used to remove PCR duplicates after alignment. Please also see this FAQ on UMIs and data trimming.
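The read layout described above (6-base UMI, “TATA” linker, 12-base random priming sequence) can be illustrated with a small custom script. This is a minimal Python sketch, not the core's pipeline and not a replacement for UMI-tools: the function names are hypothetical, and appending the UMI to the read name with an underscore is an assumed convention chosen for illustration.

```python
UMI_LEN = 6        # UMI: first 6 bases of the forward read
LINKER = "TATA"    # common linker that follows the UMI
PRIMER_LEN = 12    # random priming sequence after the linker
TRIM_LEN = UMI_LEN + len(LINKER) + PRIMER_LEN  # 22 bases in total


def has_expected_linker(seq: str) -> bool:
    """True if the linker sits directly after the UMI, as the layout predicts."""
    return seq[UMI_LEN:UMI_LEN + len(LINKER)] == LINKER


def move_umi_to_header(header: str, seq: str, qual: str):
    """Append the UMI to the read name and trim the first 22 bases.

    Reads shorter than 22 bases come back with an empty sequence and
    quality string; callers may want to discard them.
    """
    umi = seq[:UMI_LEN]
    read_id, _, comment = header.partition(" ")
    new_header = f"{read_id}_{umi}"  # assumed naming convention
    if comment:
        new_header += f" {comment}"
    return new_header, seq[TRIM_LEN:], qual[TRIM_LEN:]


# Made-up example read: UMI + linker + priming sequence + transcript-derived bases
seq = "ACGTAC" + "TATA" + "GGCCTTAAGGCC" + "GATTACAGATTACA"
qual = "I" * len(seq)
print(has_expected_linker(seq))
print(move_umi_to_header("@read1 1:N:0:ACGT", seq, qual))
```

Checking for the linker before trimming is optional, but it is a cheap way to notice reads that do not follow the expected layout.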
This example MDS plot shows an analysis of 3′ Tag-Seq data of macrophage cells exposed to three types of bacterial infections and mock-infections at two time points. The analysis distinguishes the responses to the individual bacterial species and the duration of the infections. Even the reactions to the mock-infections are clustered by time points.
3′ Tag-Seq is a protocol generating exceptionally low-noise and low-cost gene expression profiling data. For more details on the technology, see our FAQs on 3′ Tag-RNA-Seq.
The Batch-Tag-Seq packages include 3′ Tag-RNA-Seq library preparations & sequencing and optional data analysis at low per-sample rates. The differential gene expression (DGE) data analysis is performed by the Bioinformatics Core. For Batch-Tag-Seq we collect samples and process them in larger batches, then sequence the barcoded libraries together on sequencing runs, allowing us to offer affordable recharge rates on a per-sample basis. This also simplifies the budgeting and planning of experiments, since scientists will not have to adjust their experiments to the sequencing capacity of the sequencers.
Batch-Tag-Seq cost for UC system clients is currently $104 per sample, including both library preparation and sequencing (minimum submission is 8 samples).
Please note: In order to enable low cost DGE data analyses and short turnaround times, we will process the Batch-Tag-Seq samples in high throughput fashion. We will not spend time customizing the protocol for individual samples (for example we will not run sample cleanups, sample concentrations, or repeat the library preps or PCR amplification with varying cycle numbers). The 3′ Tag-RNA-Seq library prep protocol is very robust, therefore no problems should be expected as long as the RNA samples fulfill the sample requirements. Please note that the customer is responsible for the sample quality.
Optional Bioinformatics: DGE Analysis
A prerequisite for bioinformatic data analysis is the availability of a well annotated reference genome, including UTRs (to be provided by the customer). The deliverables will include data QC, read-counts-per-gene tables, and DGE analysis for simple comparisons. At least three biological replicate samples need to be sequenced for each condition. Depending on the nature of the samples and the phenotypes, meaningful experiments might require higher numbers of replicates. For details, please inquire with the Bioinformatics Core staff (email@example.com). The DGE analysis service (if selected) will include:
- QC and pre-processing of data
- Mapping to genome reference
- Analysis in Bioconductor/R: normalized read counts
- QA/QC metrics from preprocessing and mapping
- Single-factor DGE analysis
- MDS plot and any other QA/QC plots from DE analysis
- Differential Expression Table
- Full description of processing method
– 3′ Tag-Seq is only suitable for eukaryotic total RNA samples (A-tailed transcripts).
– Each Batch-Tag-Seq project requires a minimum of 8 samples.
– For each condition at least three biological replicate samples need to be sequenced to allow for a meaningful DGE analysis. As with any other DGE study, statistically meaningful experiments might require higher replicate numbers, depending on the nature of the samples and the phenotypes. Both the DNA Tech Core and the Bioinformatics Core (firstname.lastname@example.org) offer free consultations to discuss such details before starting projects.
– The RNA samples need to be dissolved in molecular biology grade H2O or EB buffer. As always, samples submitted for RNA-seq need to be DNA-free.
– The sample concentration should be determined by fluorometry (e.g. Qubit; plate reader with RiboGreen), as spectrophotometric quantifications (e.g. NanoDrop) are very unreliable.
– To assure the chemical purity of the samples, absorbance ratios should be between 1.8 and 2.1 (260/280 nm ratio) and above 1.5 (260/230 nm ratio).
– The total RNA sample concentration should be between 25 and 100 ng/ul (as quantified by Qubit). Please submit at least 15 ul of each sample. We can work with lower concentrations using custom protocols (see below).
– For custom processing projects we can accept samples with concentrations as low as 4 ng/ul (as quantified by Qubit). We can’t vouch for the results, though.
– If RNA isolation protocols involving TRIzol are used, the RNA should then be purified via a spin column kit (e.g. Zymo RNA Clean & Concentrator) to remove any solvent traces.
– Customers should provide Bioanalyzer traces (or equivalent). Alternatively we can run RNA sample QC on a Bioanalyzer or LabChip GX for an additional fee.
– 3′ Tag-Seq has a relatively high tolerance for RNA integrity variation. Nevertheless, we do not recommend using RNA samples with a RIN score lower than 6 for batch processing. We cannot guarantee a successful outcome of the library prep and sequencing for lower quality samples, or for samples without Bioanalyzer QC.
– You will receive around 3 to 6 million reads per sample. For typical experiments, about 2 million reads per sample are required for the DGE analysis of highly- and moderately-expressed genes.
– The availability of a well annotated reference genome (including UTRs; to be provided by the customer) is a prerequisite for bioinformatic data analysis.
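The numeric requirements in the list above can be turned into a quick pre-submission check. The following is a minimal Python sketch, not a tool provided by the core: the `Sample` fields and the function names are hypothetical, and only the thresholds (8 samples, 3 replicates per condition, 25 to 100 ng/ul, 15 ul, absorbance ratios, RIN 6) are taken from the list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    name: str
    condition: str
    conc_ng_per_ul: float  # fluorometric (e.g. Qubit) concentration
    volume_ul: float
    a260_280: float
    a260_230: float
    rin: float


def check_sample(s: Sample) -> List[str]:
    """Return the batch-processing requirements that one sample misses."""
    problems = []
    if not 25 <= s.conc_ng_per_ul <= 100:
        problems.append("concentration outside 25-100 ng/ul")
    if s.volume_ul < 15:
        problems.append("volume below 15 ul")
    if not 1.8 <= s.a260_280 <= 2.1:
        problems.append("260/280 ratio outside 1.8-2.1")
    if s.a260_230 <= 1.5:
        problems.append("260/230 ratio not above 1.5")
    if s.rin < 6:
        problems.append("RIN below 6")
    return problems


def check_batch(samples: List[Sample]) -> List[str]:
    """Check project-level rules first, then every individual sample."""
    problems = []
    if len(samples) < 8:
        problems.append("fewer than 8 samples in the project")
    for condition, n in Counter(s.condition for s in samples).items():
        if n < 3:
            problems.append(f"condition '{condition}' has only {n} replicate(s)")
    for s in samples:
        problems.extend(f"{s.name}: {p}" for p in check_sample(s))
    return problems


# Made-up sample sheet: two conditions, four replicates each, one weak sample
batch = [
    Sample(f"s{i}", "control" if i < 4 else "treated", 60.0, 20.0, 2.0, 2.0, 8.5)
    for i in range(7)
]
batch.append(Sample("s7", "treated", 10.0, 20.0, 2.0, 2.0, 5.0))
for issue in check_batch(batch):
    print(issue)
```

A sample flagged here is not necessarily unusable; as noted in the list, lower concentrations may still be handled through custom protocols.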
Custom 3′ Tag RNA-Seq
In contrast to the batch processing described above, we can adjust the library prep parameters for 3′ Tag-Seq when running custom library preps. The custom protocols can generate usable data for inputs as low as 10 ng total, and also work with degraded RNA samples. Custom protocols do require additional labor. Please inquire with us and provide a description of your samples.
Lowest input 3′ Tag RNA-Seq
For samples with the lowest RNA input amounts (picograms up to 10 ng) we offer an alternative protocol, designed with single-cell transcriptome analysis in mind. Please inquire about the Qiagen UPX 3′ RNA-seq protocol, and provide information about the total RNA amounts available, sample numbers, nature of the samples, and the objectives of the project.
Recommended reading in our FAQs:
Which protocols or kits do you recommend for RNA isolations from human and animal samples? How many cells will I need?
Do you have recommendations for the isolation of plant total RNA samples?
How should I purify my samples? How should I remove DNA or RNA contamination?
Where can I find the UMIs in the Tag-Seq data? When and how should I trim my Tag-Seq data? What is the low complexity stretch in the Tag-Seq data?