PEMapper and PECaller provide a simplified approach to whole-genome sequencing
Date
2017
Type:
Article
Extent
10
Publisher
National Academy of Sciences
Abstract
The analysis of human whole-genome sequencing data presents significant computational challenges. The sheer size of datasets places an enormous burden on computational, disk array, and network resources. Here, we present an integrated computational package, PEMapper/PECaller, that was designed specifically to minimize the burden on networks and disk arrays, create output files that are minimal in size, and run in a highly computationally efficient way, with the single goal of enabling whole-genome sequencing at scale. In addition to improved computational efficiency, we implement a statistical framework that allows for a base-by-base error model, allowing this package to perform as well as or better than the widely used Genome Analysis Toolkit (GATK) in all key measures of performance on human whole-genome sequences.
Citation
Proc Natl Acad Sci U S A. 2017 Mar 7;114(10):E1923-E1932
Keywords
GATK, SNP calling, genome sequencing, sequence mapping, software