Opera: Reconstructing optimal genomic scaffolds with high-throughput paired-end sequences

Song Gao, Niranjan Nagarajan, Wing Kin Sung

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

Scaffolding, the problem of ordering and orienting contigs, typically using paired-end reads, is a crucial step in the assembly of high-quality draft genomes. Even as sequencing technologies and mate-pair protocols have improved significantly, scaffolding programs still rely on heuristics, with no guarantees on the quality of the solution. In this work we explore the feasibility of an exact solution for scaffolding and present a first fixed-parameter tractable solution for assembly (Opera). We also describe a graph contraction procedure that allows the solution to scale to large scaffolding problems and demonstrate this by scaffolding several large real and synthetic datasets. In comparisons with existing scaffolders, Opera simultaneously produced longer and more accurate scaffolds, demonstrating the utility of an exact approach. Opera also incorporates an exact quadratic programming formulation to precisely compute gap sizes.
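
To make the problem statement concrete, the following toy sketch searches exhaustively for the ordering and orientation of a few contigs that leaves the fewest paired-end links unsatisfied. It is not Opera's algorithm: the abstract gives no implementation details, and the link model and the names `is_concordant` and `best_scaffold` are assumptions made for illustration only. The brute-force search is exponential in the number of contigs, which is the cost a fixed-parameter tractable method is designed to avoid.

```python
from itertools import permutations, product


def is_concordant(link, position, orientation):
    """Return True if a paired-end link agrees with the proposed scaffold.

    A link (a, ori_a, b, ori_b) is an assumed toy model: it claims contig a
    precedes contig b with the given orientations ("+" or "-").
    """
    a, ori_a, b, ori_b = link
    forward = (position[a] < position[b]
               and orientation[a] == ori_a
               and orientation[b] == ori_b)
    # Reading the scaffold from the opposite strand reverses the order
    # and flips both orientations, so that layout also satisfies the link.
    reverse = (position[b] < position[a]
               and orientation[a] != ori_a
               and orientation[b] != ori_b)
    return forward or reverse


def best_scaffold(contigs, links):
    """Exhaustively find a layout with the fewest discordant links."""
    best = None
    for order in permutations(contigs):
        position = {c: i for i, c in enumerate(order)}
        for signs in product("+-", repeat=len(contigs)):
            orientation = dict(zip(order, signs))
            discordant = sum(
                not is_concordant(link, position, orientation)
                for link in links
            )
            if best is None or discordant < best[0]:
                best = (discordant, list(zip(order, signs)))
    return best


if __name__ == "__main__":
    links = [("A", "+", "B", "+"), ("B", "+", "C", "-")]
    print(best_scaffold(["A", "B", "C"], links))
```

With three contigs the search visits 3! x 2^3 = 48 layouts; each additional contig multiplies that count, so the sketch is only usable on very small inputs.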

Language: English
Title of host publication: Research in Computational Molecular Biology - 15th Annual International Conference, RECOMB 2011, Proceedings
Pages: 437-451
Number of pages: 15
DOIs: https://doi.org/10.1007/978-3-642-20036-6_40
Publication status: Published - 4 Apr 2011
Event: 15th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2011 - Vancouver, BC, Canada
Duration: 28 Mar 2011 to 31 Mar 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6577 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2011
Country: Canada
City: Vancouver, BC
Period: 28/03/11 to 31/03/11

Keywords

  • Fixed-parameter Tractable
  • Genome Assembly
  • Graph Algorithms
  • Scaffolding

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Gao, S., Nagarajan, N., & Sung, W. K. (2011). Opera: Reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. In Research in Computational Molecular Biology - 15th Annual International Conference, RECOMB 2011, Proceedings (pp. 437-451). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6577 LNBI). https://doi.org/10.1007/978-3-642-20036-6_40