By Julia Karow
A little less than a year after the full commercial launch of Pacific Biosciences' PacBio RS sequencer, customers and collaborators are using the system for a variety of applications, including de novo assembly of genomes and transcriptomes in combination with other types of sequencing data, variant validation, analyses of trinucleotide repeat sequences, genome methylation analysis, and targeted cancer gene resequencing.
At the Advances in Genome Biology and Technology meeting in Marco Island, Fla., last month, researchers discussed their use of the platform in conference talks, poster presentations, and during a company-organized workshop.
One popular application of PacBio's long single-molecule reads appears to be the de novo assembly of genomes and transcriptomes, usually in combination with other types of sequencing data.
Adam Phillippy's group at the National Biodefense Analysis and Countermeasures Center in Frederick, Md., for example, has developed a hybrid error-correction and de novo assembly method that maps short reads from 454, Illumina, or Ion Torrent to the error-prone PacBio reads, improving their accuracy from about 85 percent to up to 99.9 percent. The team then assembles the corrected PacBio reads using a new version of the Celera assembler (IS 1/24/2012).
Phillippy and his colleagues have applied this strategy to several genomes, including a bacterial genome, which they were able to assemble into a single contig; the genome of a parakeet; and the transcriptome of corn.
Erich Jarvis from the Duke University Medical Center, who collaborated with Phillippy on the parakeet genome, noted that using a combination of 454 data and error-corrected PacBio reads doubled the N50 contig size, compared to other combinations of sequence data that did not use PacBio.
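For readers unfamiliar with the metric, N50 is the contig length at which contigs of that size or longer account for at least half of the total assembled bases. The following is a minimal sketch of that calculation, not the code used by any of the groups mentioned; the function name `n50` and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
fragmented = [10, 10, 10, 10]
more_contiguous = [20, 10, 10]  # same total, two contigs joined

print(n50(fragmented))
print(n50(more_contiguous))
```

Because the two example assemblies have the same total length, the higher N50 of the second reflects only that more of the sequence sits in longer contigs, which is the sense in which a doubled N50 indicates a more contiguous assembly.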
While PacBio data is difficult to use on its own because of its high error rate, Phillippy said, it is powerful in combination with other types of sequence reads. He said he sees future applications for the technology in genome finishing, the assembly of complex eukaryotic genomes, haplotype phasing, and the analysis of mixed samples.
However, he noted that the platform's reliability, throughput, accuracy, and cost remain challenges, as does the large amount of starting DNA required to construct long-read libraries.
Others have been exploring the PacBio platform for hybrid de novo genome assemblies as well. David Jaffe from the Broad Institute, for example, presented the assembly of near-finished prokaryotic genomes from Illumina and PacBio reads, and researchers from the Department of Energy Joint Genome Institute reported a similar application. Gregory Harhay's team from the US Department of Agriculture's Meat Animal Research Center has explored a combination of PacBio and 454 data to produce reference genome sequences of bacteria.