PacBio Library Prep & Sequencing

Please also see: Introduction to PacBio Sequencing and General Information

PacBio Sequel Sequencing: Services and Performance

We offer complete PacBio library prep, BluePippin size selection, and sequencing services. Base calling and secondary analyses such as Reads-Of-Insert extraction, HGAP assembly, and Iso-Seq analyses of single or a small number of SMRT-cells are included in the service. We will provide you with the complete data set generated by the PacBio sequencer and the SMRT-Portal analyses for download from our servers. This includes FASTQ files of the filtered subreads as well as the “.h5” raw data and intermediate files (BAM files for the Sequel). The Bioinformatics Core offers in-depth analyses of PacBio sequence data. We now offer HMW-DNA isolation as a service for animal samples. Plant samples have to be provided to the Core lab as isolated DNA. Because PacBio is a single-molecule sequencing technology, the sequence data quality and yields depend highly on the quality of the DNA and RNA samples.

The PacBio Sequel Sequencer

The Sequel is the second-generation PacBio sequencer. It generates 7x more reads per SMRT-cell than the RSII. Its read length metrics have fully caught up with those of its predecessor and often surpass them. The Sequel chemistry is still undergoing rapid development and the read lengths continue to increase. The principles of the single-molecule sequencing chemistry are unchanged. The sequence coverage and the sequencing errors do not show any detectable sequence-specific biases. Pacific Biosciences has demonstrated a high-quality genome assembly for Arabidopsis based on the data of only two Sequel SMRT-cells. The chemistry employed for the Arabidopsis project is now available (11/2016). PacBio advises running 3 or 4 titration SMRT-cells to optimize the loading of the Sequel. We find that one cell is now sufficient for most libraries.
Typical Sequel run times are 10 hours per SMRT-cell.  For long-amplicon and Iso-Seq sequencing, the run times can be extended to 20 hours using special and higher-priced SMRT-cells.
Unlike those of other technologies, PacBio sequencing adapters have a hairpin structure. This allows repeated circular sequencing of both strands, for example accumulating 20 reads of a particular library molecule. Read errors can be removed very efficiently from the sequences via Circular-Consensus-Sequencing (CCS) analysis, resulting in the highest sequencing data quality currently possible. In contrast to other technologies, the PacBio sequencing chemistry has been shown not to be sensitive to extreme GC contents. Even long GCC repeat stretches can be sequenced.
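To illustrate why repeated passes over the same molecule remove random read errors, here is a deliberately simplified sketch. It assumes the passes are already aligned and of equal length and uses a plain per-position majority vote. This is not the actual CCS algorithm, which also handles insertions and deletions; the function name and the example sequences are made up for illustration.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the per-position majority base across pre-aligned passes.

    Toy model only: real CCS analysis aligns the passes and models
    insertions/deletions; here every pass must have the same length.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this toy example assumes equal-length, aligned passes")
    consensus = []
    for column in zip(*passes):
        # Pick the most frequent base observed at this position.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical passes over one library molecule; each error is random
# and hits a different position, so the vote recovers the true sequence.
passes = [
    "ACGTACGTAC",
    "ACGTTCGTAC",  # error at position 4
    "ACGAACGTAC",  # error at position 3
    "ACGTACGTAC",
    "CCGTACGTAC",  # error at position 0
]
consensus = majority_consensus(passes)
print(consensus)  # ACGTACGTAC
```

Because the errors are random rather than systematic, they rarely coincide at the same position, which is why the consensus improves as more passes accumulate.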

Most Common Applications for PacBio Sequencing

Whole Genome Shotgun Sequencing:  Due to the long read lengths (read length N50 values can exceed 26 kb) and the random nature of the PacBio sequencing errors, PacBio sequencing is the technology enabling the highest quality de novo genome assemblies as a standalone approach. Typically a genome coverage of about 60X is used for de novo assemblies. This is a generalization, and each experiment will need to be adjusted based on heterozygosity, ploidy, DNA quality, repeat content, etc. PacBio data are similarly used to identify structural genomic variants.
Given high-quality DNA samples and long-insert libraries, mean polymerase read lengths of up to 15 kb and mean yields of around 8 Gb per SMRT-cell can be expected after titration. For the highest quality genomic DNA samples we now see average yields of more than 10 Gb, even for plant samples. Subreads can reach lengths of up to 60 kb and subread N50 values can reach up to 26 kb. For bacterial genome sequencing, the PacBio data can also be analyzed for DNA methylation (the m6A and m4C modifications in prokaryotic DNA).
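The coverage and yield figures above can be combined into a rough planning estimate. The sketch below assumes the 60X target coverage and the roughly 8 Gb mean yield per SMRT-cell quoted in the text; the function name and the example genome sizes are hypothetical, and actual yields depend on sample quality.

```python
import math


def smrt_cells_needed(genome_size_gb, target_coverage=60.0, yield_per_cell_gb=8.0):
    """Estimate the number of SMRT-cells for a de novo assembly.

    Defaults follow the figures quoted in the text (about 60X coverage,
    about 8 Gb per SMRT-cell); treat the result as a starting point only.
    """
    if genome_size_gb <= 0 or target_coverage <= 0 or yield_per_cell_gb <= 0:
        raise ValueError("all inputs must be positive")
    required_gb = genome_size_gb * target_coverage
    # Round up: a partial SMRT-cell cannot be sequenced.
    return math.ceil(required_gb / yield_per_cell_gb)


# Hypothetical 1 Gb genome at the typical yield, and at the higher
# yield seen for the best samples.
print(smrt_cells_needed(1.0))                          # 8
print(smrt_cells_needed(1.0, yield_per_cell_gb=10.0))  # 6
# Hypothetical 5 Mb bacterial genome.
print(smrt_cells_needed(0.005))                        # 1
```

Heterozygosity, ploidy, and repeat content can shift the appropriate coverage target, so the estimate should be revisited for each project.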

Iso-Seq (full-length transcript RNA-Seq):  The PacBio long reads enable sequencing of full-length transcripts up to 10 kb (essentially all transcripts), thus eliminating the need for error-prone transcript assembly from short reads. The Iso-Seq bioinformatics pipeline processes the data into high-quality consensus transcript sequences, enabling accurate isoform annotation and open reading frame prediction. These features make Iso-Seq the method of choice for applications such as de novo gene annotation. The Sequel chemistry significantly simplifies the Iso-Seq protocol compared to the first-generation PacBio sequencer.

Long Amplicon Sequencing:  Please see our FAQ for details. Amplicons of up to 8 kb in length can be sequenced for highest quality CCS data. We offer a universal 24×24 index PCR primer set that you can pick up in the lab. It allows the pooling and demultiplexing of up to 576 amplicon samples.
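The figure of 576 samples follows from combining each of the 24 forward indices with each of the 24 reverse indices. As a minimal sketch, the snippet below enumerates these combinations and assigns one unique pair per sample. The index labels (F01 to F24, R01 to R24) and sample names are placeholders, not the names of the actual primer set.

```python
from itertools import product

# Placeholder labels; the real primer set uses its own index names.
forward_indices = [f"F{i:02d}" for i in range(1, 25)]
reverse_indices = [f"R{i:02d}" for i in range(1, 25)]


def assign_index_pairs(sample_names, forward, reverse):
    """Give every sample a unique (forward, reverse) index combination."""
    pairs = list(product(forward, reverse))
    if len(sample_names) > len(pairs):
        raise ValueError(
            f"{len(sample_names)} samples exceed {len(pairs)} index combinations"
        )
    return dict(zip(sample_names, pairs))


samples = [f"sample_{i:03d}" for i in range(1, 577)]  # hypothetical names
layout = assign_index_pairs(samples, forward_indices, reverse_indices)

print(len(layout))           # 576
print(layout["sample_001"])  # ('F01', 'R01')
print(layout["sample_576"])  # ('F24', 'R24')
```

Keeping such a sample-to-index table alongside the submission makes it straightforward to demultiplex the pooled amplicons after sequencing.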

 

Sample Requirements, Sample Submission and Scheduling

PacBio library prep requires microgram amounts of DNA. The recommended amount of input DNA further correlates with the desired read length; see these PacBio guidelines for SMRTbell libraries for details. Since the PacBio technology interrogates single molecules, any defect (e.g. a nick, an abasic site, a DNA adduct) can interfere with the sequencing process. Thus, the integrity and purity of the DNA sample are of utmost importance. The DNA quality and the DNA amount will determine which library insert sizes are feasible and how many SMRT-cells can be sequenced. The DNA samples should fulfill these criteria:

  • Minimal DNA purity: OD 260/280 should be 1.8-2.0; OD 260/230 should be >2.0
  • Has undergone a minimum of freeze-thaw cycles.
  • Has not been exposed to high temps (> 65°C for more than one hour can cause a detectable decrease in sequence quality).
  • Has not been exposed to pH extremes (< 6 or > 9).
  • Does not contain insoluble material.
  • Is RNA-free.
  • Has not been exposed to intercalating fluorescent dyes or ultraviolet radiation.
  • Does not contain divalent metal cations (e.g., Mg2+), denaturants (e.g., guanidinium salts, phenol), detergents, or carryover contaminants from the source material (e.g., heme, humic acid, polyphenols). Amplicon samples should be submitted in EB buffer (EDTA-free).
  • Must be double-stranded. Single-stranded DNA cannot be converted into SMRTbell templates but can interfere with polymerase binding.
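Some of the criteria above are easy to check programmatically before submission. The following is a minimal sketch that covers only the measurable or yes/no items (the two OD ratios, RNA, insoluble material, strandedness). The class and field names are hypothetical, and passing this check does not replace the QC performed by the Core.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical record of pre-submission measurements for one sample."""
    od_260_280: float
    od_260_230: float
    rna_free: bool
    has_insoluble_material: bool
    double_stranded: bool


def check_sample(qc):
    """Return a list of criteria from the list above that the sample fails."""
    problems = []
    if not 1.8 <= qc.od_260_280 <= 2.0:
        problems.append("OD 260/280 should be 1.8-2.0")
    if not qc.od_260_230 > 2.0:
        problems.append("OD 260/230 should be >2.0")
    if not qc.rna_free:
        problems.append("sample must be RNA-free")
    if qc.has_insoluble_material:
        problems.append("sample must not contain insoluble material")
    if not qc.double_stranded:
        problems.append("DNA must be double-stranded")
    return problems


good = SampleQC(1.85, 2.2, True, False, True)
bad = SampleQC(1.6, 1.9, False, False, True)
print(check_sample(good))  # []
print(check_sample(bad))
```

Exposure history (freeze-thaw cycles, heat, pH extremes, UV, intercalating dyes) cannot be measured after the fact and has to be tracked during sample handling.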

The sample requirements will vary strongly depending on genome size. Please contact us to discuss your project.
We do offer HMW-DNA isolation as a service.   The table shows minimum DNA sample requirements:

[Table image: minimum DNA sample requirements for PacBio libraries]

One SMRT-seq library can provide sufficient material for sequencing from as few as two to over 20 SMRT cells, depending on the amount and quality of the starting material and the desired size selection cutoffs. For example, for a 20 kb insert library we would ask for 20 µg of high-quality DNA to be able to use the BluePippin size selection (and submitting more DNA would be recommended). Lower sample amounts can be used with less stringent size selection, resulting, however, in reduced average read lengths. DNA samples are best dissolved in TE buffer at a high concentration (~100 ng/µl or higher) and shipped on blue ice packs or wet ice. Bacterial DNA samples are best isolated from cells in logarithmic growth phase. Generally, the sequencing of a single SMRT-cell combined with BluePippin size selection will be sufficient to assemble bacterial chromosomes as single scaffolds.
As with other single-molecule sequencing technologies, the read lengths and the sequencing yields depend on the nucleic acid sample quality. Every nick or chemical DNA adduct has the potential to abort a read. For difficult DNA samples, especially plant DNA samples with hard-to-remove contaminants (e.g. some polysaccharides), we recommend carrying out a high-salt/phenol/chloroform cleanup (please note that this protocol often leads to a loss of 50% of the sample) or a purification with the BorealGenomics Aurora instrument (please inquire). Before library prep, we will QC your sample by Pulsed-Field Gel Electrophoresis (PFGE; on BioRad CHEF, Pippin Pulse) and spectrometry. A band (not a smear) of fragments 50 kb or longer indicates the high-integrity DNA desired for the generation of long-insert libraries. Please note that spectrometry and PFGE cannot fully assess the quality and suitability of the DNA samples since they assess the DNA as double-stranded molecules. For example, single-strand nicks and chemical adducts could escape these methods. The QC data do, however, allow us to rule out clearly problematic samples.
The final quality assessment of the DNA sample will, however, be the single-molecule sequencing process itself (e.g. the average read lengths). Bacterial DNA samples extracted using silica spin columns will be sheared by the columns to fragments of about 20 kb in size. Such bacterial samples tend to generate high-quality data and are acceptable. This method is not recommended for eukaryotic samples.

Before submitting samples:

  • Please email us a picture of an agarose gel of the sample that also includes a marker with an upper band of at least 20 kb (e.g. GeneRuler 1 kb Plus DNA Ladder or Lambda DNA/HindIII Digest Marker; we also suggest a lane with undigested Lambda phage DNA [48 kb, e.g. NEB N3011S]). Please run the electrophoresis slowly (e.g. at 80 V, depending on the setup).
  • Assess sample purity via spectrophotometry: the 260/280 ratio should be between 1.8 and 2.0 and the 260/230 ratio should be higher than 2.0. If necessary, PacBio recommends MoBio PowerClean columns for sample cleanup, or the high-salt/phenol/chloroform protocol mentioned above.
  • Please use fluorometric methods (e.g. Qubit) for DNA quantification if possible. Measure each sample at least 3 times and accept only reproducible measurements (HMW DNA is often not perfectly dissolved).  Spectrophotometry is not reliable for quantifications (especially if the DNA extraction protocol used CTAB).
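One way to decide whether triplicate fluorometric readings are "reproducible" is to look at their coefficient of variation (CV). The sketch below does this with a 10% CV cutoff. That cutoff is our assumption for illustration only; the guidelines above do not specify a numeric threshold, and the function name and readings are hypothetical.

```python
from statistics import mean, stdev


def summarize_replicates(readings_ng_per_ul, max_cv=0.10):
    """Return (mean, CV, is_reproducible) for replicate concentration readings.

    max_cv=0.10 is an assumed cutoff, not a value from the guidelines.
    """
    if len(readings_ng_per_ul) < 3:
        raise ValueError("measure each sample at least 3 times")
    average = mean(readings_ng_per_ul)
    if average <= 0:
        raise ValueError("mean concentration must be positive")
    # Sample standard deviation relative to the mean.
    cv = stdev(readings_ng_per_ul) / average
    return average, cv, cv <= max_cv


# Hypothetical readings: a well-dissolved sample and a poorly dissolved one.
avg_ok, cv_ok, ok = summarize_replicates([98, 102, 100])
avg_bad, cv_bad, bad_ok = summarize_replicates([60, 100, 140])
print(avg_ok, round(cv_ok, 3), ok)        # 100 0.02 True
print(avg_bad, round(cv_bad, 3), bad_ok)  # 100 0.4 False
```

Widely scattered readings usually indicate that the HMW DNA is not fully dissolved, in which case the sample should be allowed to dissolve further and measured again.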

The DNA samples used for making PacBio libraries must be handled with extreme care – if you need to ship your DNA to our facility, please consult the following PacBio guidelines for shipping and handling. More info and the submission form can be found on our Sample Submission & Scheduling page.   We must receive electronic and print copies of the submission form.

Prices

Costs for PacBio sequencing reflect the number of libraries and number of SMRT cells required. Our recharge rates can be viewed here.  The listed fees include all labor and reagents. BluePippin size selection is optional (and carries an additional fee), but is highly recommended. Please note our reduced high throughput (HT) recharge rates; these apply if 10 or more SMRT-cells are sequenced per sample or if 6 or more libraries are generated.

 
