This morning Twitter is awash with posts discussing the newly announced nanopore sequencers from Oxford Nanopore. Speculation has been rife for some time about the potential specifications of the first sequencers to be produced by the company, and it certainly appears that they have fulfilled the expectations placed upon them.
I’m not going to go through the details of the two sequencers announced, since others have done a good job of listing the specs here and here. Basically you have machines with either 512 or 2000 nanopores, capable of sequencing fragments up to 40kb (but probably several kb routinely), with error rates of around 4%, mostly indels, and promises of imminent improvements to bring this value down. All of this comes at a per-base cost similar to the best of the existing platforms.
Reading through the initial comments about this new platform my first reaction was that we have to get one (or more) of these, but after calming down and thinking about this for a bit I thought I’d have a stab at going through the use cases where this type of sequencer really makes sense.
De-novo sequence assembly: Oh yes!
The one place where this new platform is a complete no-brainer is if you’re assembling a de-novo genome sequence. Whilst Illumina sequencers can give you good coverage depth from paired-end reads of around 100bp, there are always regions of the genome whose repetitive nature means that this will not provide enough context to allow a contiguous assembly. Currently you either need to start creating mate-pair libraries, which are notoriously difficult to produce, or you need to get your floor reinforced and stump up a huge amount of cash for a PacBio. The prospect of generating reads of 10kb+ with a simple library prep should be music to your ears, and a 4% error rate with short indels should be easy to work around with a mixed assembly.
Metagenomics: Oh yes!
In the same vein as de-novo assembly, the prospect of longer reads should make metagenomic studies much easier. Getting more context for your reads should allow you to distinguish between related species much more easily, and assembly of mixed bacterial populations should be possible even with the slightly more limited throughput of these sequencers.
I guess the main advantage of the nanopore platform for genotyping is the speed with which it can generate data. Data collection begins almost immediately upon addition of the sample, and real-time monitoring of the data output means that you can stop the run as soon as you have observed all of the variants you are looking for. The long read lengths should allow the elucidation of even the most complex genome re-arrangements. The somewhat high error rates may be problematic, but if these really are mostly indels, then SNP calling might still be practical. The per-base cost means that the current sequencers aren’t yet practical for real-time diagnostic use, but future developments on this platform would seem to make this a possibility.
One of the promises of nanopore sequencing was the ability to distinguish modified bases during the base calling process. PacBio have shown that they are able to distinguish hydroxy-methyl-cytosine from cytosine, and suggest that identification of methyl-cytosine is theoretically possible. In the reports I’ve seen so far Oxford Nanopore haven’t said anything concrete yet about the ability of their platform to call modified bases, but if this proves to be possible and reliable then this will become an essential bit of kit for labs working on epigenetics. The ability not only to read modifications directly, but to be able to put these in the context of a multi-kb fragment is truly exciting. The addition of a hairpin structure at the end of a fragment would also allow these sequencers to read both strands of the same fragment, again providing contextual information which has so far been lacking.
It’s possible that the nanopore sequencers may still be of use to epigenetics even without the ability to read modifications directly. Genome-wide bisulphite sequencing is already being undertaken on Illumina sequencers, and should be possible on nanopore sequencers. However, the bisulphite treatment itself is very harsh and fragments the DNA sample as it modifies it, so the very long reads which can be obtained from normal genomic DNA may be elusive once it has been treated.
ChIP-Seq: Not really
The power of ChIP-Seq comes from the number of observations you make, not the length of those observations. The nanopore sequencers seem to be best suited to sequencing fewer, longer fragments, which would not be an advantage for ChIP-Seq. There seems no obvious reason why short insert libraries couldn’t be sequenced on a nanopore platform, but at the moment we know very little about the overhead of starting a new sequence on the same nanopore, so it isn’t clear how feasible this would be. Longer reads, on the other hand, would simply reduce the resolution of the ChIP assay. For some applications it might be interesting to monitor ChIP results in real time, and be able to halt a run once clear peaks had emerged, but in the short term I can’t see this being a good option for this type of experiment.
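To put some rough numbers on the observations-versus-length point, here is a minimal sketch. The 1 Gb run yield and both read lengths are made-up illustrative figures, not the specifications of any platform, and `reads_for_budget` is just a hypothetical helper name.

```python
def reads_for_budget(total_bases, read_length):
    """Number of whole reads (observations) a fixed base budget can yield."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return total_bases // read_length


# Assumed, illustrative figures only: a 1 Gb run spent on short or long reads.
RUN_YIELD_BASES = 1_000_000_000

short_reads = reads_for_budget(RUN_YIELD_BASES, 50)      # short-read style
long_reads = reads_for_budget(RUN_YIELD_BASES, 10_000)   # long-read style

print(f"50bp reads:  {short_reads:,} observations")
print(f"10kb reads:  {long_reads:,} observations")
```

For the same number of bases, the short reads give 200 times as many independent observations, which is what a counting assay like ChIP-Seq actually cares about.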
RNA-Seq: It depends
As with ChIP-Seq, much of the power of RNA-Seq comes from the number of observations which have been made. To make a reasonable measurement of low-expressed transcripts, very large numbers of sequences must be generated, and the existing short read platforms will likely have an advantage in this regard for some time, so for simple quantitation of transcripts the nanopore platform may not offer huge advantages. Where the longer read lengths of the nanopore sequencers will be of use is in the elucidation and quantitation of splice variants. Current RNA-Seq reads each cover a very small part of the transcript and often do not provide enough context to determine exactly which splice variant they came from. Performing relative quantitation of the splice variants of a gene is therefore not a simple process. Longer reads from a nanopore sequencer could cover the whole length of a transcript, removing all doubt about exactly which variant it was. Whether this proves to be a useful tool for expression quantitation will depend on whether the platform is able to generate an unbiased selection of reads (or a selection with a well understood bias) to allow accurate quantitation.
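The splice variant ambiguity can be shown with a toy example. This is only a sketch under simplifying assumptions: the gene, its three variants and the exon names are hypothetical, and a read is reduced to the ordered list of exons it covers rather than being aligned as real sequence.

```python
# Hypothetical gene with three splice variants, each an ordered tuple of exons.
ISOFORMS = {
    "variant_A": ("e1", "e2", "e3", "e4"),
    "variant_B": ("e1", "e3", "e4"),
    "variant_C": ("e1", "e2", "e4"),
}


def compatible_isoforms(read_exons, isoforms=ISOFORMS):
    """Return the sorted names of isoforms that contain the read's exons
    as one contiguous run, i.e. the variants the read could have come from."""
    read = tuple(read_exons)
    n = len(read)
    hits = []
    for name, exons in isoforms.items():
        # Slide a window of the read's length along the isoform's exon chain.
        if any(exons[i:i + n] == read for i in range(len(exons) - n + 1)):
            hits.append(name)
    return sorted(hits)


# A short read sitting inside a shared exon fits every variant.
print(compatible_isoforms(["e4"]))
# A read spanning one junction narrows it down, but not completely.
print(compatible_isoforms(["e3", "e4"]))
# A read covering the whole transcript leaves a single candidate.
print(compatible_isoforms(["e1", "e3", "e4"]))
```

The short read is compatible with all three variants, the junction-spanning read with two, and only the full-length read pins down a single variant, which is the case where the long reads would earn their keep.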
So do I think we should get one of these sequencers? Heck yes! At $900 apiece for the MinION there’s absolutely no excuse for everyone not to get one and start playing with it to see what it can do. For much of the workload we currently have it may be that this platform isn’t going to revolutionise what we do, but if nothing else it will hopefully spur on the existing manufacturers to push forward the development of their existing platforms. In any case the scientists win. We live in exciting times…