Please also see the Introduction to PacBio Sequencing and General Information, or submit a sample with our submission form.
PacBio Sequel II Sequencing: Services and Performance
We offer complete PacBio library prep, BluePippin size selection, and sequencing services. Base calling and secondary analyses such as Reads-Of-Insert extraction, HGAP assembly (small genomes), and IsoSeq analyses of a single or a small number of SMRT-cells are included in the service. We will provide you with the complete data set generated by the PacBio sequencer and the SMRT-LINK analyses for download from our servers. The Bioinformatics Core offers in-depth analyses of PacBio sequence data. We now also offer HMW-DNA isolation as a service. Because PacBio is a “single-molecule sequencing” technology, sequence data quality and yields depend strongly on the quality of the DNA and RNA samples.
The PacBio Sequel II Sequencer
The Sequel II is the third-generation PacBio sequencer. It generates up to 8x more sequencing data per SMRT-cell than the Sequel I. The read length metrics have improved compared to the predecessor and are expected to increase again within a month. The Sequel chemistry is undergoing rapid development and the read lengths continue to increase. The principles of the single-molecule sequencing chemistry are unchanged. Neither the sequence coverage nor the sequencing errors show any detectable sequence-specific bias.
The latest PacBio chemistry enables High-Fidelity (HiFi) Long Reads: The chemistry significantly increases the average lifespan of the polymerase, enabling significantly longer reads which can be used to sequence molecules of 10 kb to 15 kb length with 99.9% accuracy (the coming chemistry will extend this range to 20 kb). Unlike other technologies, PacBio sequencing adapters have a hairpin structure. This allows the repeated circular sequencing of both strands. Read errors can be efficiently removed from the sequences via Circular-Consensus-Sequencing (CCS) analysis, resulting in the highest sequencing data quality currently possible. Sequencing four passes of a library molecule is expected to yield Q20 data (99% accuracy), and nine or more passes Q30 data (99.9% accuracy). In contrast to other technologies, PacBio sequencing chemistry is not sensitive to extreme GC contents. Even long GCC repeat stretches can be sequenced. Please see these slides for more details.
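The Q20 and Q30 values quoted above are Phred quality scores, which relate to per-base accuracy through the standard formula accuracy = 1 - 10^(-Q/10). The following is a minimal sketch of that conversion; the function names are illustrative and are not part of any PacBio software.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score Q into per-base accuracy: 1 - 10^(-Q/10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Inverse conversion; accuracy must be strictly below 1.0."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    for q in (20, 30):
        print(f"Q{q}: {phred_to_accuracy(q):.2%} accuracy")
```

Running the script prints the accuracy for Q20 (99%) and Q30 (99.9%), matching the figures given for four and nine or more CCS passes.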
The Sequel II:
- enables two sequencing modes for genomic samples: high-fidelity (HiFi) Long Reads and Continuous Long Reads (CLRs).
- improves the accuracy of genome assemblies through HiFi reads and increases the average polymerase read length from ~20 kb to ~30 kb (for inserts > 20 kb).
- increases the polymerase read length further for shorter-insert libraries, yielding up to 50 Gb and average polymerase read lengths of up to 100 kb per SMRT-cell.
- 30 Hour Movies: Due to the improved polymerase performance, it is now suggested to sequence amplicons longer than 3 kb, Iso-Seq libraries, and of course high-fidelity long read projects (10 to 15 kb inserts) with 20 hour movies instead of the standard 10 hour movies. The gains from the doubled run time for the longest insert libraries (>30 kb) are marginal at the moment, but will become interesting with the V2 Express-library kit in 2019.
- Please see the information from PacBio on the new sequencer.
- This page summarizes the most frequent applications for Sequel sequencing as well as the expected required sequencing efforts (number of SMRT-cells required).
Sequel II run times range from 10 to 30 hours per SMRT-cell. For high-fidelity long reads, 30-hour run times are used.
Most Common Applications for PacBio Sequencing
Whole Genome Shotgun Sequencing: Due to the long read lengths (read length N50 values can exceed 26 kb) and the random nature of the PacBio sequencing errors, PacBio sequencing is the technology enabling the highest quality de novo genome assemblies as a standalone approach. Typically a genome coverage of about 60X is used for de novo assemblies. This is a generalization, and experiments will need to be adjusted based on heterozygosity, ploidy, DNA quality, repeat content, etc. The PacBio data are similarly used to identify structural genomic variants.
Given high quality DNA samples and long insert libraries, mean polymerase read lengths of up to 15 kb and mean yields of around 8 Gb per SMRT-cell can be expected after titration. For the highest quality genomic DNA samples we now see average yields of more than 10 Gb, even for plant samples. Subreads can reach lengths of up to 60 kb and subread N50 values can reach up to 26 kb. For bacterial genome sequencing the PacBio data can be analyzed for DNA methylation (the m6A and m4C modifications in prokaryotic DNA).
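To get a rough idea of the sequencing effort for a de novo assembly, the coverage target and per-cell yield quoted above can be combined into a simple estimate: cells = genome size × coverage / yield per cell, rounded up. The sketch below uses the ~60X and ~8 Gb figures from the text as defaults. The function name and the 500 Mb example genome are hypothetical, and real projects will deviate depending on DNA quality, heterozygosity, ploidy, and repeat content.

```python
import math


def smrt_cells_needed(genome_size_bp: float,
                      coverage: float = 60.0,
                      yield_per_cell_bp: float = 8e9) -> int:
    """Estimate the number of SMRT-cells for a target genome coverage.

    Defaults reflect the ~60X coverage and ~8 Gb per SMRT-cell mentioned
    in the text; both are rough planning values, not guarantees.
    """
    if genome_size_bp <= 0 or coverage <= 0 or yield_per_cell_bp <= 0:
        raise ValueError("all inputs must be positive")
    required_bases = genome_size_bp * coverage
    return math.ceil(required_bases / yield_per_cell_bp)


if __name__ == "__main__":
    # Hypothetical 500 Mb genome at the default 60X coverage.
    print(smrt_cells_needed(500e6))
    # Same genome, assuming a higher 10 Gb yield per SMRT-cell.
    print(smrt_cells_needed(500e6, yield_per_cell_bp=10e9))
```

Such an estimate is only a starting point for planning; please contact us to discuss the actual requirements of your project.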
Iso-Seq (full-length transcript RNA-Seq): The PacBio long reads enable sequencing of full-length transcripts up to 10 kb (essentially all transcripts), thus eliminating the need for error-prone transcript assembly from short reads. The Iso-Seq bioinformatics pipeline processes the data into high-quality consensus transcript sequences, enabling accurate isoform annotation and open reading frame prediction. These features make Iso-Seq the method of choice for, for example, de novo gene annotations. The Sequel chemistry significantly simplifies the Iso-Seq protocol compared to the first-generation PacBio sequencer. Iso-Seq libraries can be barcoded and pooled at different stages of the library prep (after the reverse transcription; after PCR-amplification of cDNAs; after PacBio library prep). Please note that the transcript lengths can vary significantly between tissues and sample types. Thus, the read numbers per sample cannot be predicted accurately and can vary significantly between pooled Iso-Seq libraries. Pooling the samples earlier in the library prep process will result in lower costs but also in higher variation of read numbers.
Long Amplicon Sequencing: Please see our FAQ for details. Amplicons of up to 15 kb length can be sequenced for highest quality CCS data. We offer two universal barcoded PCR primer sets (12×12 and 24×24 indices) that you can pick up in the lab (for a fee). These allow the pooling of up to 576 amplicon samples.
The high-fidelity long read approach (10 kb to 15 kb CCS sequencing with 30 hour runs) enables highest accuracy genome assemblies, haplotype phasing, the analysis of metagenomes and population samples, as well as the assembly of polyploid and highly heterozygous genomes.
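The pooling capacity of the two primer sets follows from pairing every forward index with every reverse index: 12×12 gives 144 combinations and 24×24 gives 576. The following sketch enumerates such combinations. The index names (F01, R01, ...) are placeholders for illustration and are not the actual primer names.

```python
from itertools import product
from typing import List, Tuple


def barcode_pairs(n_forward: int, n_reverse: int) -> List[Tuple[str, str]]:
    """Enumerate all forward/reverse index combinations.

    The labels are hypothetical placeholders, not real primer identifiers.
    """
    forward = [f"F{i:02d}" for i in range(1, n_forward + 1)]
    reverse = [f"R{i:02d}" for i in range(1, n_reverse + 1)]
    return list(product(forward, reverse))


if __name__ == "__main__":
    print(len(barcode_pairs(12, 12)))  # 144 combinations
    print(len(barcode_pairs(24, 24)))  # 576 combinations
```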
Sample Requirements, Sample Submission and Scheduling
PacBio library prep requires microgram amounts of DNA. The recommended amount of input DNA further correlates with the desired read length – see these PacBio guidelines for SMRTbell libraries for details. Since the PacBio technology interrogates single molecules, any defect (e.g. a nick, an abasic site, a DNA adduct) can interfere with the sequencing process. Thus, the integrity and purity of the DNA sample are of utmost importance. The DNA quality and the DNA amount will determine which library insert sizes are feasible and how many SMRT-cells can be sequenced. The DNA samples should fulfill these criteria:
- Minimal DNA purity: OD 260/280 should be 1.8-2.0; OD 260/230 should be >2.0
- Has undergone a minimum of freeze-thaw cycles.
- Has not been exposed to high temps (> 65°C for more than one hour can cause a detectable decrease in sequence quality).
- Has not been exposed to pH extremes (< 6 or > 9).
- Does not contain insoluble material.
- Is RNA-free.
- Has not been exposed to intercalating fluorescent dyes or ultraviolet radiation.
- Does not contain divalent metal cations (e.g., Mg2+), denaturants (e.g., guanidinium salts, phenol), detergents, or carryover contaminants from the source organism or tissue (e.g., heme, humic acid, polyphenols). Amplicon samples should be submitted in EB buffer (EDTA-free).
- Must be double-stranded. Single-stranded DNA cannot be converted into SMRTbell templates but can interfere with polymerase binding.
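The numeric criteria from the list above (260/280 ratio of 1.8-2.0, 260/230 ratio above 2.0, pH between 6 and 9) can be checked before submission with a small helper. This is only a sketch: the function name is illustrative, and passing these checks does not guarantee a suitable sample, since criteria such as nicks or adducts are not captured by these measurements.

```python
from typing import List, Optional


def purity_issues(a260_280: float,
                  a260_230: float,
                  ph: Optional[float] = None) -> List[str]:
    """Return a list of failed criteria; an empty list means all checks passed."""
    issues = []
    if not 1.8 <= a260_280 <= 2.0:
        issues.append(f"260/280 ratio {a260_280:.2f} is outside 1.8-2.0")
    if not a260_230 > 2.0:
        issues.append(f"260/230 ratio {a260_230:.2f} is not above 2.0")
    # The pH check is optional because pH is often not measured directly.
    if ph is not None and not 6.0 <= ph <= 9.0:
        issues.append(f"pH {ph:.1f} is outside 6-9")
    return issues


if __name__ == "__main__":
    for problem in purity_issues(a260_280=1.65, a260_230=1.9, ph=8.0):
        print(problem)
```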
The sample requirements will vary strongly depending on genome size. Please contact us to discuss your project.
We do offer HMW-DNA isolation as a service. The table shows minimum DNA sample requirements:
One SMRT-seq library can provide sufficient material for sequencing from as few as two to over 20 SMRT-cells, depending on the amount and quality of the starting material and the desired size selection cutoffs. For example, for a 20 kb insert library we would ask for 20 µg of high quality DNA to be able to use the BluePippin size selection (and submitting more DNA would be recommended). Lower sample amounts can be used with less stringent size selection, resulting however in reduced average read lengths. DNA samples are best dissolved in TE buffer at a high concentration (~100 ng/µl or higher) and shipped on blue ice packs or wet ice. Bacterial DNA samples are best isolated from cells in logarithmic growth phase. Generally, the sequencing of a single SMRT-cell accompanied by BluePippin size selection will be sufficient to assemble bacterial chromosomes as single scaffolds.
As with other single-molecule sequencing technologies, the read lengths and the sequencing yields depend on the nucleic acid sample quality. Every nick or chemical DNA adduct has the potential to abort a read. For difficult DNA samples, especially plant DNA samples with hard-to-remove contaminants (e.g. some polysaccharides), we recommend carrying out a high-salt/phenol/chloroform cleanup (please note that this protocol often leads to a loss of 50% of the sample) or a purification with the BorealGenomics Aurora instrument (discontinued but still available, please inquire). Before library prep, we will QC your sample by Pulsed-Field Gel Electrophoresis (PFGE; on BioRad CHEF, Pippin Pulse) and spectrometry. A band (not a smear) of fragments 50 kb or longer indicates the high-integrity DNA desired for generating long-insert libraries. Please note that spectrometry and PFGE cannot fully assess the quality and suitability of the DNA samples, since they assess the DNA as double-stranded molecules. For example, single-strand nicks and chemical adducts could escape these methods. The QC data do, however, allow us to rule out clearly problematic samples.
The final quality assessment of the DNA sample will, however, be the single-molecule sequencing process itself (e.g. the average read lengths). Bacterial DNA samples extracted using silica columns will be sheared by the spin columns to fragments of about 20 kb in size. Such bacterial samples tend to generate high quality data and are acceptable. This method is not recommended for eukaryotic samples.
Before submitting samples:
- Please email us a picture of an agarose gel of the sample, run alongside a marker with an upper band of at least 20 kb (e.g. GeneRuler 1 kb Plus DNA Ladder or Lambda DNA/HindIII Digest Marker; we also suggest a lane with undigested Lambda phage DNA [48 kb, e.g. NEB N3011S]). Please run the electrophoresis slowly (e.g. at 80V depending on setup).
- Assess sample purity via spectrophotometry – the 260/280 ratio should be between 1.8 and 2.0 and the 260/230 ratio should be higher than 2.0. PacBio recommends MoBio PowerClean columns for sample cleanup, or the high-salt/phenol/chloroform protocol mentioned above if necessary.
- Please use fluorometric methods (e.g. Qubit) for DNA quantification if possible. Measure each sample at least 3 times and accept only reproducible measurements (HMW DNA is often not perfectly dissolved). Spectrophotometry is not reliable for quantifications (especially if the DNA extraction protocol used CTAB).
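One way to judge whether replicate fluorometric readings are reproducible is the coefficient of variation (standard deviation divided by the mean). The sketch below assumes a 10% cutoff, which is an arbitrary illustrative threshold and not a facility or PacBio requirement; the function name is likewise hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def qubit_summary(readings_ng_per_ul: Sequence[float],
                  max_cv: float = 0.10) -> Tuple[float, float, bool]:
    """Return (mean concentration, CV, reproducible?) for replicate readings.

    max_cv = 0.10 is an assumed cutoff chosen for illustration only.
    """
    if len(readings_ng_per_ul) < 3:
        raise ValueError("measure each sample at least 3 times")
    avg = mean(readings_ng_per_ul)
    if avg <= 0:
        raise ValueError("mean concentration must be positive")
    cv = stdev(readings_ng_per_ul) / avg
    return avg, cv, cv <= max_cv


if __name__ == "__main__":
    avg, cv, ok = qubit_summary([100.0, 102.0, 98.0])
    print(f"mean {avg:.1f} ng/ul, CV {cv:.1%}, reproducible: {ok}")
```

Widely scattered readings usually indicate that the HMW DNA is not yet fully dissolved, in which case the sample should be mixed gently and measured again.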
The DNA samples used for making PacBio libraries must be handled with extreme care – if you need to ship your DNA to our facility, please consult the following PacBio guidelines for shipping and handling. More info and the submission form can be found here. We must receive electronic and print copies of the submission form.
Costs for PacBio sequencing reflect the number of libraries and number of SMRT cells required. Our recharge rates can be viewed here. The listed fees include all labor and reagents. BluePippin size selection is optional (and carries an additional fee), but is highly recommended. Please note our reduced high throughput (HT) recharge rates; these apply if 10 or more SMRT-cells are sequenced per sample or if 6 or more libraries are generated.
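The eligibility rule for the high throughput rates stated above can be written as a simple check. This is only an illustration of the rule as worded here; the function name is hypothetical and the current recharge rate page remains authoritative.

```python
def qualifies_for_ht_rate(smrt_cells_per_sample: int, libraries: int) -> bool:
    """HT rates apply with 10 or more SMRT-cells per sample or 6 or more libraries."""
    if smrt_cells_per_sample < 0 or libraries < 0:
        raise ValueError("counts must not be negative")
    return smrt_cells_per_sample >= 10 or libraries >= 6


if __name__ == "__main__":
    print(qualifies_for_ht_rate(smrt_cells_per_sample=12, libraries=1))
    print(qualifies_for_ht_rate(smrt_cells_per_sample=4, libraries=2))
```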