Pacific Biosciences (PacBio) has recently published a high-quality de novo genome assembly for Arabidopsis based on the data of only two Sequel SMRT-cells. The Sequel is the second-generation PacBio sequencer and generates up to 7x more reads per SMRT-cell than the first-generation PacBio RSII sequencer. In addition to sequencing on the RSII, the DNA Technologies Core is now offering sequencing on the Sequel, using the chemistry employed for the Arabidopsis project mentioned above.
Whereas the footprint of the Sequel sequencer is only a third of that of the RSII, the size and the surface area of the Sequel SMRT-cells have grown (the golden colored surface shown in the images below), allowing the number of zero-mode waveguides (ZMWs) to increase from 150,000 to 1 million per cell. Further, the optics, the filters, and the camera sensor are now integrated into each SMRT-cell.
Sequel read lengths are currently shorter than what can be achieved on the RSII (please see below). However, pending chemistry improvements in the next quarter are expected to extend Sequel read lengths closer to those of the RSII. The principles of the single molecule sequencing chemistry remain unchanged. As before, the sequence coverage and the sequencing errors do not show any detectable sequence-specific biases. Given high-quality DNA samples and long-insert libraries, mean polymerase read lengths of 10 kb can be expected, as can yields of around 5 Gb per SMRT-cell after titration. Subreads can reach lengths of up to 35 kb.
Please note that several of the run metrics provided by the Sequel software are currently still quite imprecise. The important P1 percentage metric, however, can be relied on. Accurate polymerase read length and subread length numbers will require aligning the reads to a reference genome and running a “Resequencing Analysis”. Thus, all the PacBio read metrics mentioned below are based on E. coli libraries and might not reflect results with eukaryotic samples. PacBio currently advises running 3 or 4 titration SMRT-cells to optimize the loading of the Sequel. We expect that the number of required titration cells can be reduced, similar to the RSII, once all the run metrics are accurate and once we have gained more experience with the new sequencer.
The sample requirements and the library prep protocols are essentially the same as for the RSII (requiring one additional column filtration).
Please note that PacBio sequencing metrics can always vary from reagent batch to reagent batch and with the sample quality. Based on our initial data and assuming high-quality libraries, the costs and metrics of the Sequel and the RSII currently seem to compare as follows: In practice, the Sequel read numbers seem to be 5 to 6 times higher than the RSII numbers. The SMRT-cell and reagent costs for the Sequel are about 4 times higher than for the RSII. However, due to reduced labor, our recharge rates for sequencing a Sequel SMRT-cell will likely be only about 2.8 times higher (excluding library preps). The number of sequenced bases can be about 3.5 times higher.
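The relative figures above can be combined into a rough cost per sequenced base and per read. The following is only a back-of-the-envelope sketch in Python: the factors are the approximate ones quoted in this post, the variable and function names are made up for illustration, and library prep costs are left out, as they are in the recharge rate comparison.

```python
# Approximate Sequel-vs-RSII factors as quoted in the text (RSII = 1.0).
# These are rough estimates from initial data, not fixed specifications.
RECHARGE_RATE_FACTOR = 2.8        # recharge rate per SMRT-cell, excluding library prep
BASES_FACTOR = 3.5                # sequenced bases per SMRT-cell
READS_FACTOR_RANGE = (5.0, 6.0)   # reads per SMRT-cell, observed in practice


def relative_cost_per_unit(cost_factor: float, yield_factor: float) -> float:
    """Return the Sequel cost per unit of output relative to the RSII."""
    return cost_factor / yield_factor


cost_per_base = relative_cost_per_unit(RECHARGE_RATE_FACTOR, BASES_FACTOR)
cost_per_read = tuple(
    relative_cost_per_unit(RECHARGE_RATE_FACTOR, factor)
    for factor in READS_FACTOR_RANGE
)

print(f"Relative cost per base: {cost_per_base:.2f}x the RSII")
print(
    "Relative cost per read: "
    f"{cost_per_read[1]:.2f}x to {cost_per_read[0]:.2f}x the RSII"
)
```

Under these assumptions, a sequenced base on the Sequel costs roughly 0.8 times what it costs on the RSII, and a read roughly half as much. The per-read advantage is larger than the per-base advantage because the Sequel reads are shorter.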
The longest polymerase reads of the Sequel (43 kb) will reach about 60% of the length of the longest RSII reads (~70 kb) for 6 hour runs. The N50 length of the polymerase reads currently seems to reach up to 63% (15 kb) of the best RSII metrics for genomic samples (24 kb).
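For readers unfamiliar with the N50 metric used above: it is the read length at which reads of that length or longer contain at least half of all sequenced bases. The following is a minimal Python sketch of the common definition, not the calculation performed by the PacBio software. The function name and the read lengths in the example are made up for illustration.

```python
from typing import Iterable


def n50(read_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of read lengths.

    Reads are sorted from longest to shortest and their lengths summed
    until the running total reaches at least half of all bases. The
    length of the read that crosses this threshold is the N50.
    """
    lengths = sorted(read_lengths, reverse=True)
    if not lengths:
        raise ValueError("n50() requires at least one read length")
    total_bases = sum(lengths)
    running_total = 0
    for length in lengths:
        running_total += length
        if 2 * running_total >= total_bases:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy read lengths in bases (hypothetical values, not measured data).
toy_lengths = [10_000, 6_000, 5_000, 4_000, 3_000, 2_000]
print(n50(toy_lengths))
```

In the toy example, the two longest reads together contain 16,000 of the 30,000 bases, so the N50 is 6,000. Because the metric is weighted by bases and not by read count, a small number of very long reads raises it noticeably.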
Please note that our RSII, for most samples, significantly outperforms PacBio’s specifications. For high-quality DNA samples, the yields mostly range between 1.1 and 1.6 Gb per SMRT-cell (after titration). In contrast, PacBio’s official yield estimates are 500 Mb to 1 Gb per SMRT-cell.
When to use the Sequel and when to use the RSII? Based on the metrics above, the current Sequel chemistry seems to be especially suited for Iso-Seq protocols (whole transcript sequencing), amplicon sequencing (amplicons up to 2 kb?), and metagenomic applications (16S rRNA profiling and shotgun sequencing). Further, the Sequel will be the instrument of choice when fast, high-throughput long-read sequencing is required. Due to the significant advantages conveyed by the subset of the longest sequencing reads, the RSII sequencer is likely the preferable instrument for most genome-wide studies of large eukaryotic genomes, especially for de novo genome assemblies. Economical bacterial genome sequencing on the Sequel will require library barcoding and multiplexing, since a single Sequel SMRT-cell generates data sufficient for multiple bacterial genomes. However, the demultiplexing of PacBio data is currently accompanied by significant data loss, making this option less appealing.
This assessment may change with the next Sequel chemistry and as we gain expertise with the new sequencer. We will keep you informed. Let us know if we can help with any PacBio questions.