Short-read (Illumina) Library Construction Services

Illumina libraries can be sequenced on both Illumina and Element Biosciences AVITI platforms. As sequencing output increases and experimental scales grow, generating libraries for sequencing is often the rate-limiting step. Our lab employs liquid-handling robots to minimize sample handling variation and to provide fast turnaround times. Our library preparation services include library QC, library quantification, and library pooling. All our libraries are indexed (have a barcode). We use UDI barcodes (unique dual indexing) wherever the protocol allows.
We are happy to discuss the options and protocols suitable for your specific research projects. We prepare standard sequencing libraries as well as many specialized libraries and can also help you with custom projects. At the moment we are offering:

Genomic DNA libraries with different-sized inserts
RNA-seq libraries with poly-A enrichment, strand-specific
RNA-seq libraries with ribo-depletion, strand-specific
3′-Tag-Seq RNA-seq for gene expression profiling
High-Throughput (HT) library preps (DNA or RNA; starting from 24 libraries; please see below)
miRNA-seq libraries (microRNA)
Mate-Pair libraries
10X-Genomics GemCode (“linked-read”) libraries
Exome capture
ChIP-seq
Methyl-Seq: WGBS (Whole Genome Bisulfite Seq) and RRBS (Reduced Representation Bisulfite Seq)
Reduced Representation libraries (e.g. for Genotyping By Sequencing (GBS) applications)
PCR-free libraries
BluePippin library size selection

We have automated library preparation and can generate up to 96 different barcoded libraries using the Caliper Sciclone G3 robot for consistent quality and rapid turnaround. We can also provide training and access to the robots if you wish to use the instruments yourself for large-scale projects. Please see this page for details on sample packaging and shipping.

DNA/RNA Sample Quantification and Purity

All samples have to be accompanied by appropriate sample QC documentation (e.g. Bioanalyzer traces, Agarose gel electrophoresis images). It speeds up the projects if this QC can be carried out as early as possible in your lab. However, we are happy to run the sample QC for you, for a fee.

The input DNA and RNA quantities specified below and in this table apply if the samples are quantified by a fluorometric method (e.g. Qubit, PicoGreen, RiboGreen). Fluorometry provides advantages in precision and specificity (e.g. DNA dyes will not bind to or measure RNA). If a spectrophotometer (e.g. Nanodrop) is used, we suggest submitting twice the requested amount of sample, since this type of measurement is often unreliable for quantification. In any case, sample amounts higher than the minimum requirements will improve the library complexity. Spectrophotometer readings are nevertheless very useful to assess the purity of samples. For DNA samples the 260/280 ratio should be between 1.8 and 2.0 and the 260/230 ratio should be higher than 2.0. For RNA samples the 260/280 ratio should be between 1.8 and 2.1 and the 260/230 ratio should be higher than 1.5. Values outside of these ranges indicate contamination. The Real-time PCR Core can carry out DNA as well as RNA extractions.

RNA samples need to be DNA-free and be dissolved in molecular biology grade water (RNAse-free; not DEPC treated; RNAse-free EB buffer is OK).
DNA samples need to be RNA-free (check on agarose gel) and dissolved in EB buffer or TLE buffer (please see bottom of the page). Molecular biology grade water is also OK.
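
The purity ranges above can be checked before submission with a few lines of Python. This is only a sketch: the function name `check_purity` and the message wording are our own illustration, and the thresholds are simply the ratios quoted in the text.

```python
# Purity thresholds as quoted in the text above.
PURITY_LIMITS = {
    "DNA": {"ratio_260_280": (1.8, 2.0), "min_260_230": 2.0},
    "RNA": {"ratio_260_280": (1.8, 2.1), "min_260_230": 1.5},
}


def check_purity(sample_type, ratio_260_280, ratio_260_230):
    """Return a list of warnings; an empty list means both ratios are in range."""
    limits = PURITY_LIMITS[sample_type.upper()]
    low, high = limits["ratio_260_280"]
    warnings = []
    if not (low <= ratio_260_280 <= high):
        warnings.append(
            f"260/280 = {ratio_260_280} is outside {low}-{high}"
        )
    # The text asks for a 260/230 ratio *higher than* the stated value.
    if ratio_260_230 <= limits["min_260_230"]:
        warnings.append(
            f"260/230 = {ratio_260_230} is not above {limits['min_260_230']}"
        )
    return warnings


if __name__ == "__main__":
    print(check_purity("DNA", 1.85, 2.2))  # in range: no warnings
    print(check_purity("RNA", 1.70, 1.2))  # both ratios flagged
```

A passing check only covers the two absorbance ratios; it does not replace the gel or Bioanalyzer QC requested above.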

Sample Requirements

Starting material for Illumina library construction can be double-stranded (ds) DNA from any source: genomic DNA, BACs, PCR amplicons, ChIP samples, any type of RNA turned into ds cDNA (mRNA, normalized total RNA, smRNAs), etc. Pretty much anything you can think of that ends up as, or can be turned into, dsDNA. This dsDNA is then fragmented (if it is not already, as in ChIP). The average fragment length should not exceed 600 bp (HiSeq 2500, MiSeq) or 350 bp (HiSeq 3000 & 4000). Then the ends are repaired and ‘A’-tailed, adapters are ligated on, size selection is carried out, and PCR is performed to generate the final library ready for sequencing. Different library types might vary in the details (such as PCR-free libraries), but this is the basic workflow. An excellent forum for sequencing-related questions of all kinds on all platforms is the Seqanswers.com forum.

High-Throughput Library Preps (HT-Processing)

We offer high-throughput sequencing library preparations (reduced HT rates starting from 24, 48, and 96 samples) for genomic DNA library preps as well as RNA-seq library preps after either poly-A enrichment or ribo-depletion.
HT processing implies that the samples are submitted ready-to-use and that we can’t pamper individual samples (e.g. purify or concentrate them). In short, the suitability of the samples is the responsibility of the lab submitting them.

All samples for the high-throughput library preparations (at reduced HT rates) need to be normalized to within ±20% of the average sample concentration.
Please make sure that all your samples fulfill the sample purity and sample concentration requirements.
The HT processing rates do not include re-dos of library preps that fail due to sample inadequacies.
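
The ±20% normalization rule can be checked on a sample sheet before shipping. The following is a minimal sketch; the function name and the example sample names and concentrations are made up for illustration.

```python
def outside_normalization_window(concentrations, tolerance=0.20):
    """Flag samples whose concentration is not within +/- tolerance of the mean.

    concentrations: dict mapping sample name -> concentration (e.g. ng/ul).
    Returns (mean, flagged) where flagged maps sample name -> concentration.
    """
    mean = sum(concentrations.values()) / len(concentrations)
    low = mean * (1 - tolerance)
    high = mean * (1 + tolerance)
    flagged = {
        name: conc
        for name, conc in concentrations.items()
        if not (low <= conc <= high)
    }
    return mean, flagged


if __name__ == "__main__":
    # Hypothetical plate excerpt, concentrations in ng/ul.
    plate = {"S1": 50.0, "S2": 55.0, "S3": 45.0, "S4": 90.0}
    mean, flagged = outside_normalization_window(plate)
    print(f"mean = {mean:.1f} ng/ul, needs re-normalization: {sorted(flagged)}")
```

Note that a single outlier shifts the mean, so it is worth re-running the check after diluting or concentrating the flagged samples.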

Please also see the Comprehensive Sample Requirements Table

DNA Based Libraries

Guidelines for Submission of Library-Worthy DNA
Provide 1 ug or more of high quality DNA (concentration > 50 ng/ul, OD 260/280 close to 1.8; 260/230 ratio > 2.0) in EB or TE buffer (EB buffer preferred), or molecular biology grade water. Library construction can also be attempted from less input material, with caveats. If the total input material for library prep is below 100 ng, special library prep protocols need to be used. For PCR-free libraries sample amounts of 2 ug DNA are recommended; working with less is possible.
DNA samples need to be RNA-free. On an ethidium bromide-stained agarose gel, RNA contamination will be visible as a halo-like smear in the range of 50 to 200 bp. Please submit an agarose gel image of the DNA samples together with the samples.
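
As a rough planning aid, the amounts above can be turned into a small calculation. This sketch assumes the figures quoted on this page (1 ug standard, 2 ug recommended for PCR-free, and twice the amount when quantified by spectrophotometer); the function names are illustrative only.

```python
def total_mass_ug(conc_ng_per_ul, volume_ul):
    """Total DNA mass in micrograms from concentration (ng/ul) and volume (ul)."""
    return conc_ng_per_ul * volume_ul / 1000.0


def requested_dna_ug(pcr_free=False, quantified_by="fluorometry"):
    """Amount of DNA to submit, following the guidelines in the text.

    quantified_by: "fluorometry" (Qubit, PicoGreen) or "spectrophotometry"
    (Nanodrop); for the latter the text suggests submitting twice as much.
    """
    amount = 2.0 if pcr_free else 1.0
    if quantified_by == "spectrophotometry":
        amount *= 2
    return amount


if __name__ == "__main__":
    have = total_mass_ug(conc_ng_per_ul=60, volume_ul=20)
    need = requested_dna_ug(pcr_free=False, quantified_by="spectrophotometry")
    print(f"available: {have} ug, requested: {need} ug, enough: {have >= need}")
```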

ChIP-seq Libraries
We offer library construction from chromatin immunoprecipitated material. For these more complex experiments, discussions with Core personnel regarding suitability of starting material and construction strategy are recommended. No guarantees are offered with this library service, other than we’ll do our best! The ChIP-Seq Data Technical Note and ChIP-Seq DataSheet from Illumina provide some background information.

We require the ChIP-seq DNA samples to be submitted de-crosslinked and sonicated. The fragment length should ideally be between 100 and 300 bp, as this will result in the tightest peaks for your data. In any case, it is recommended to keep the fragment length under 500 bp.

For ChIP-seq we sometimes need to start out with samples whose concentration is too low to measure. Otherwise the general DNA sample recommendations apply (the buffer should be EB buffer or EBT; please see HERE), and more sample is certainly recommended if available.

Please make sure to run the input controls on the Bioanalyzer or on an agarose gel beforehand and email us an image of these.
You will also want to sequence one input control per cell line/sample type.
It is further recommended to verify by qPCR the enrichment of your regions of interest (e.g. promoter regions) vs. the control samples before submitting the samples for sequencing.

The required read number per sample will vary from target to target. For studying point-source transcription factors, the ENCODE project recommends analyzing at least 20 million (uniquely mapping) reads (http://genome.cshlp.org/content/22/9/1813.long#boxed-text-2). Depending on the quality of your preps, perhaps 75% of the reads can be expected to be uniquely mapping. ENCODE tends to err on the high side with their recommendations. Thus, about 20 million reads per sample should be acceptable, but this is likely the minimum number.
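
To see how the uniquely mapping fraction affects the number of reads to request, the arithmetic can be sketched as follows. The function name is our own, and the 75% figure is only the rough estimate given above, not a guaranteed mapping rate.

```python
import math


def total_reads_needed(unique_reads_target, unique_fraction):
    """Raw reads required to expect `unique_reads_target` uniquely mapping reads."""
    if not 0 < unique_fraction <= 1:
        raise ValueError("unique_fraction must be in (0, 1]")
    return math.ceil(unique_reads_target / unique_fraction)


if __name__ == "__main__":
    # 20 million uniquely mapping reads at an assumed 75% unique mapping rate.
    reads = total_reads_needed(20_000_000, 0.75)
    print(f"{reads / 1e6:.1f} million raw reads")
```

In other words, reaching the full ENCODE figure of uniquely mapping reads at a 75% mapping rate would take somewhat more than 20 million raw reads per sample, which is why 20 million raw reads is best treated as a lower bound.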

Mate Pair Libraries
The sequencing of Mate Pair libraries generates long-insert paired-end reads. The libraries are generated by self-ligation of long DNA fragments and labeling of the junction sites to generate chimeric library molecules that bring together sequences that were originally 2 kb to 12 kb apart. We are using the Illumina Nextera Mate Pair kit, which employs a transposase enzyme to fragment as well as end-tag the DNA in a single step. The tags are biotinylated and thus allow for the selection of fragments containing the junction sites. In contrast to older mate pair library protocols, the Nextera kit is very reliable, with the exception of the sizing of the initial fragments. As with all other long DNA fragment analyses, the DNA quality matters. Please email us a gel image before submitting the DNA samples. The samples should run as a band of 20 kb or longer on agarose gels.
The Nextera kit offers two protocols. The “gel-free” version (1 ug input) is mostly of interest when only little input DNA is available. The sizes of the mate-pair fragments from this protocol usually range from 1.5 kb to 10 kb. Surprisingly, the SSPACE scaffolder can still work with these data.
The “gel-plus” version requires a minimum of 4 ug input DNA (and 4 times the reagents) and uses gel extractions to size select fragments within a range of ±700 bp for shorter mates and within ±2 kb for longer mates of up to 10 to 12 kb. Due to the uncertainties of the fragmentation, please submit at least twice the amount of sample.
In theory the fragment sizes resulting from the tagmentation depend only on the input DNA amount. In practice the fragment lengths vary considerably between different DNA samples of similar amounts. This variability between samples can be observed even after precise DNA quantification by fluorometry. The reactions are tunable for aliquots of the same sample, though. Especially if very specific size ranges are desired, it is often necessary to repeat the tagmentation reaction with adjusted DNA amounts. We might then combine similar-sized gel extraction fractions from two tagmentation reactions to generate libraries of high complexity for the desired size ranges. Please let us know how important specific insert size ranges are for your project.
Because the fragment size ranges are difficult to predict, we quote mate pair recharge rates that include two tagmentation reactions. If we can generate the desired library with a single tagmentation, we charge the lower rate of the one-tagmentation library prep.

Target Enrichment
Numerous companies provide services and platforms for whole-exome capture or target amplification. We offer the Fluidigm Access Array, which employs nanofluidics for cost-effective target selection to generate barcoded amplicon libraries that are ready for Illumina sequencing. Sequence Capture Libraries are those in which particular genomic regions are enriched after indexed library generation and before sequencing. This strategy allows focused, very deep sequencing and can be implemented for a number of applications. Several companies offer platforms that can generate such material, including Illumina, RainDance, Agilent, NimbleGen, and Fluidigm. Technical information on the Agilent, NimbleGen, RainDance, and Qiagen (which uses a PCR-based, not a hybridization capture, strategy for enrichment) systems is available but not guaranteed to be up to date. Think of it as a starting point for further investigation (we have the contact info for company reps if needed) and as solely informational (no implied endorsement etc.).

RNA-Seq Libraries

The name is something of a misnomer because all the libraries end up as DNA; it refers to the starting material. We offer RNA-seq library preparation with a number of options, such as ribo-depletion, poly-A enrichment, and 3′-Tag-Seq (QuantSeq) libraries as described below, as well as micro-RNA (miRNA) and small RNA library preps.

Guidelines for Submission of Library-Worthy RNA
Provide at least 1 ug (2 – 5 ug preferred) of total RNA at a concentration of at least 50 ng/ul (1 ug for poly-A enrichment; 2 ug for ribo-depletion libraries; using less starting material is possible, but we can’t guarantee results). Please make sure that your RNA isolation protocol employs a DNAse digestion step or other means to remove DNA from the sample. On an agarose gel, DNA contamination will be visible as a smear or band of fragments considerably larger than the RNA (>10 kb). On the Bioanalyzer RNA chips, DNA will be visible in the size range from 4 to 10 kb. For pure RNA samples the 260/280 ratio should be between 1.8 and 2.1 and the 260/230 ratio should be higher than 1.5. Poly-A enrichment, ribo-depletion and strand-specific library prep are among the commonly requested types of service (more technical details on these appear below). If the RNA quality allows, poly-A enrichment is the first choice. Libraries for slightly degraded RNA samples should be prepared using ribo-depletion protocols. Bacterial RNA-seq will always require ribo-depletion. If possible, please avoid RNA extraction protocols involving Trizol or related phenol-containing reagents (silica column-based kits are less likely to retain contaminants). If using Trizol, protocols that contain a column-based cleanup (e.g. Direct-zol, TRIzolPlus) are recommended. Please note that an additional column cleanup is mandatory for RNA isolated from PAXgene or Tempus blood preservation tubes or with the accompanying PAXgene and Tempus RNA isolation kits. RNA samples should be eluted in molecular biology grade water, always stored in a -80 degree freezer, and shipped on dry ice.
All RNA samples require a Bioanalyzer sample QC (or equivalent). Such QC traces can be submitted by the customers or we can run the QC for a fee instead.
RNA samples need to be DNA-free.

Poly-A Enrichment
Total RNA samples can contain up to 90% ribosomal RNA sequences, which are uninformative for transcriptome or gene expression studies, while mRNAs typically make up only 1 to 2% of total RNA. Thus the enrichment of samples for mRNAs is highly desirable. Poly-A enrichment is the most commonly used method to enrich mRNA sequences from eukaryotic total RNA samples; mRNAs are selected by hybridization to poly-T oligos bound to magnetic beads. This method generates the highest percentage of reads mapping to protein encoding genes and thus is the first choice for most applications. Poly-A enrichment however requires high-quality total RNA samples. We suggest following the recommendations from Illumina – for human/animal samples use total RNA with a bioanalyzer RIN score of 8 or better, for plant material RIN numbers can be lower and tissue-specific (this is mainly a function of the chloroplast content) but should generally be higher than 7.

Ribosomal RNA Depletion
There are multiple commercially available kits to remove ribosomal RNA from your total RNA. The main reason for rRNA depletion is to reduce highly abundant ribosomal RNA, especially when transcripts do not carry poly-A tails (bacterial RNA), and also when you desire to retain all long non-coding RNA (lncRNA) and polyA classes of RNA in your sample. Commercial kits containing rRNA removal solution are available for different types of total RNA; they include human, mouse, rat, bacteria (gram positive or negative), plant leaf, plant seed and root, and yeast. Ribo-depletion protocols can further enable the analysis of slightly degraded RNA samples (the RIN scores should ideally be 5 or higher). We ask for at least 2 ug of total RNA for the preparation of ribo-depleted libraries. As always, libraries can be generated from less material, but the complexity can suffer.
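
The RIN guidance in the poly-A and ribo-depletion sections can be summarized as a simple decision rule. The sketch below is only our reading of the thresholds quoted on this page (RIN 8 or better for human/animal, higher than 7 for plant, 5 or higher for ribo-depletion, bacteria always ribo-depleted); the function name and return strings are hypothetical, and borderline samples are best discussed with the core.

```python
def suggest_rna_protocol(rin, organism="animal"):
    """Suggest an RNA-seq enrichment strategy from the RIN score.

    organism: "animal" (includes human), "plant", or "bacteria".
    Unknown organism types fall back to the stricter animal threshold.
    """
    organism = organism.lower()
    if organism == "bacteria":
        # Bacterial transcripts lack poly-A tails.
        return "ribo-depletion"
    if organism == "plant":
        polya_ok = rin > 7
    else:
        polya_ok = rin >= 8
    if polya_ok:
        return "poly-A enrichment"
    if rin >= 5:
        return "ribo-depletion"
    return "discuss with the core"


if __name__ == "__main__":
    for rin, org in [(9.1, "animal"), (6.5, "animal"), (7.5, "plant"), (3.0, "animal")]:
        print(rin, org, "->", suggest_rna_protocol(rin, org))
```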

3′-Tag-Seq (QuantSeq)
3′-Tag-Seq is a protocol to generate low-cost and low-noise gene expression profiling data. The protocol is less dependent on RNA sample integrity than poly-A enrichment protocols. More than 48 samples can be sequenced per lane. Please see this FAQ for detailed information. For high-throughput 3′-Tag-Seq library generation we require pure total RNA samples at a concentration of 100 ng/ul. For custom 3′-Tag-Seq library preps the input amounts can be as low as 10 ng total. The RNA samples for this protocol need to be isolated or cleaned up by spin-column protocols. 3′-Tag-Seq libraries are sequenced by single-end sequencing on the HiSeq 4000 or the NextSeq.

Micro RNA and Small RNA Libraries
We offer library construction for micro and small RNAs from total RNA using the Illumina protocol and reagents. We size select the libraries with high precision using the Blue Pippin system. The minimum recommended amount of total RNA required for these preps is 100 ng (recommendation for human samples). Since the total RNA composition can vary widely between tissues and organisms, please aim to provide at least 1 ug of total RNA. Please also take care that your RNA isolation method actually retains micro and small RNAs. The total RNA samples should be submitted in molecular biology grade water at a concentration of 200 ng/ul. High quality RNA is recommended (the total RNA samples should have RIN scores of 8 or higher according to a Bioanalyzer QC), and the samples should have been DNAse treated before submission.
We are using the NEXTflex™ Small RNA-Seq kit for the generation of micro RNA and small RNA-seq libraries because it significantly reduces sequence-specific biases during library preparation by employing adapters with randomized ligation junctions. For most applications, the randomized bases should be trimmed before mapping the reads.
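
Trimming the randomized junction bases amounts to removing a fixed number of bases from both ends of each read after the 3′ adapter has been clipped. The sketch below illustrates this on a single read. The number of randomized bases per junction is not stated on this page; the default `n=4` is an assumption that should be checked against the kit manual, and the function name is our own.

```python
def trim_random_bases(seq, qual, n=4):
    """Remove n bases from each end of an adapter-clipped read.

    seq, qual: read sequence and its quality string (same length).
    n: number of randomized bases per junction (ASSUMED value, verify
       against the kit documentation).
    Returns the trimmed (seq, qual); reads too short to hold an insert
    are returned as empty strings.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) <= 2 * n:
        return "", ""
    end = len(seq) - n
    return seq[n:end], qual[n:end]


if __name__ == "__main__":
    insert = "TGAGGTAGTAGGTTGTATAGTT"      # example 22-nt small RNA sequence
    read = "ACGT" + insert + "TTGA"       # made-up random bases on both sides
    trimmed, _ = trim_random_bases(read, "I" * len(read))
    print(trimmed == insert)
```

In practice this step is usually done with a standard read trimmer; the point here is only the order of operations (clip the adapter first, then remove the randomized bases).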

Strand-Specific RNA Libraries
By default we always generate strand-specific RNA-seq libraries. Please let us know if you would prefer the traditional non-stranded library prep instead. Strand-specific (also known as stranded or directional) RNA-seq libraries substantially enhance the value of an RNA-seq experiment. They add information on the originating strand and thus can precisely delineate the boundaries of transcripts in regions with genes on opposite strands, and can determine the transcribed strand of non-coding RNAs. During the cDNA synthesis dUTP is incorporated in the second-strand synthesis. After adapter ligation the dUTP-containing strand is selectively degraded to preserve strand information for RNA-seq. The forward read of the resulting sequencing data thus represents the “anti-sense strand” and the reverse read the “sense strand” of the genes (for Trinity transcriptome assemblies the “--SS_lib_type RF” option should be used).
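
The read orientation described above (forward read anti-sense, reverse read sense) can be expressed as a small lookup, which may help when configuring downstream tools. This is an illustrative sketch with a hypothetical function name, not part of any particular aligner or counting tool.

```python
def transcript_strand(read_number, alignment_strand):
    """Infer the transcribed strand from one read of a dUTP-type library.

    read_number: 1 for the forward read, 2 for the reverse read.
    alignment_strand: "+" or "-", the genomic strand the read aligned to.
    Read 1 is anti-sense to the transcript; read 2 is sense.
    """
    if read_number not in (1, 2):
        raise ValueError("read_number must be 1 or 2")
    if alignment_strand not in ("+", "-"):
        raise ValueError("alignment_strand must be '+' or '-'")
    if read_number == 2:
        return alignment_strand
    return "-" if alignment_strand == "+" else "+"


if __name__ == "__main__":
    # A forward read aligning to the plus strand comes from a minus-strand gene.
    print(transcript_strand(1, "+"))
    print(transcript_strand(2, "+"))
```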

Other Library Considerations

PCR-Free Libraries
Libraries generated without amplification will reduce library prep biases. Thus, they can improve the sequencing coverage of genomic areas such as GC-rich regions, promoters, and repeat regions, and enhance the detection of sequence variants. Please note that PCR-free libraries are more difficult to QC and quantify and that the yields tend to be lower for these libraries compared to amplified libraries (10-15%). PCR-free library prep will also require a greater amount of starting material (>5 fold).

Library Indexing and Pooling
Indexing, also called barcoding, allows for the sequencing of multiple libraries in a single lane, i.e., multiplexing. By default all libraries generated by us have a barcode. Multiplexing is required when the typical lane output of 15-25 million reads from the MiSeq, 120-180 million reads from the HiSeq 2500, or 260-310 million reads from the HiSeq 3000 is greater than required for a single library (e.g., in sequencing BACs, PCR generated fragments, small microbial genomes, transcriptomes, exome, ChIP, and small RNA applications). Multiplexing is also the best way to minimize potential lane-to-lane sequencing variation, as all of your samples are subject to the same sequencing conditions. For example, if you require two sequencing lanes for six samples we recommend 6-plexing and sequencing over two lanes, instead of 3-plexing per lane. The principle is that short nucleotide “barcodes” are appended to each library using specific adapters containing those sequences. Libraries containing different indexed adapters are then constructed, quantified, pooled in equimolar amounts, and sequenced. Deconvoluting the barcodes informatically allows multiple libraries to be sequenced in a single lane at a potential cost and time saving. To date, two methods have been exploited for this: using the commercially available indexing kits (Illumina TruSeq, Nextera, or Bioo Scientific) or synthesizing your own adapter oligos with your own barcodes. With the Illumina TruSeq v2 Library Prep Kits A and B you can use up to 24 different barcodes per kit to multiplex up to 48 libraries. Bioo Scientific offers Illumina-compatible barcodes (NEXTflex) with up to 96 barcodes. The Nextera kit (Epicentre/Illumina) uses dual indexing and transposon mediated fragmentation (‘tagmentation’) followed by PCR amplification to integrate barcoded adapters (so a PCR-free library is not an option using the Nextera kit). 
The dual indexing/adapter tagging strategy (with up to 12 indices available for adapter 1 and up to eight indices for adapter 2) permits up to 96 unique dual index combinations.
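
Two of the numbers above are easy to reproduce: how many libraries fit into one lane for a given read requirement, and how 12 and 8 indices combine into 96 dual-index pairs. The sketch below uses the lane outputs quoted in the text; the index labels (`A01`, `B01`, ...) are placeholders and not real index names, and the function names are our own.

```python
from itertools import product


def max_plex(lane_output_reads, reads_per_library):
    """Number of libraries that fit in one lane at the requested depth."""
    return lane_output_reads // reads_per_library


def dual_index_pairs(n_adapter1=12, n_adapter2=8):
    """All combinations of adapter-1 and adapter-2 indices (placeholder labels)."""
    adapter1 = [f"A{i:02d}" for i in range(1, n_adapter1 + 1)]
    adapter2 = [f"B{i:02d}" for i in range(1, n_adapter2 + 1)]
    return list(product(adapter1, adapter2))


if __name__ == "__main__":
    # Conservative (low-end) HiSeq 3000 lane output from the text, 20 M reads each.
    print(max_plex(260_000_000, 20_000_000), "libraries per lane")
    print(len(dual_index_pairs()), "dual-index combinations")
```

Using the low end of the quoted lane output leaves some margin for uneven pooling between libraries.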

Custom Sequencing Primers
Please note that these are used only for a small minority of sequencing projects. Custom sequencing primers need to be submitted at a concentration of 100 uM and a volume of 20 ul each together with the libraries. Please make sure that the sequencing primer design fits the chosen Illumina platform. MiSeq and HiSeq platforms use different annealing temperatures.

Scheduling
Once your samples are ready, you should deliver them as soon as possible to get the next available slot in the library prep queue. Please see the Sample Submission and Scheduling page.

Sample/Library Storage Policy
Please let us know if you would like to pick up your samples/libraries after they have been sequenced and we will be happy to accommodate you; otherwise, due to space limitations, they will be stored for only six months after sequencing runs have been completed.

Buffer Compositions

Common buffers for DNA and RNA samples are:
EB-Buffer: 10mM TRIS (pH= 8.0-8.4) – e.g. Qiagen EB Buffer
EBT-Buffer: 10mM TRIS, 0.1% Tween 20 (pH=8.0-8.4)
TE-Buffer: 10mM TRIS, 1 mM EDTA (pH=8.0-8.4)
TLE-Buffer: 10mM TRIS, 0.1 mM EDTA (pH=8.0-8.4)