OPERA-LG: Efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees

Song Gao, Denis Bertrand, Burton K H Chia, Niranjan Nagarajan

Research output: Contribution to journal › Article

24 Citations (Scopus)

Abstract

The assembly of large, repeat-rich eukaryotic genomes represents a significant challenge in genomics. While long-read technologies have made the high-quality assembly of small, microbial genomes increasingly feasible, data generation can be expensive for larger genomes. OPERA-LG is a scalable, exact algorithm for the scaffold assembly of large, repeat-rich genomes, out-performing state-of-the-art programs for scaffold correctness and contiguity. It provides a rigorous framework for scaffolding of repetitive sequences and a systematic approach for combining data from different second-generation and third-generation sequencing technologies. OPERA-LG provides an avenue for systematic augmentation and improvement of thousands of existing draft eukaryotic genome assemblies.

Language: English
Article number: 102
Journal: Genome Biology
Volume: 17
Issue number: 1
DOI: 10.1186/s13059-016-0951-y
Publication status: Published - 11 May 2016
Externally published: Yes

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

Gao, Song; Bertrand, Denis; Chia, Burton K H; Nagarajan, Niranjan. / OPERA-LG: Efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees. In: Genome Biology. 2016; Vol. 17, No. 1.
@article{89ce147d863b4649b70048301034ae97,
title = "OPERA-LG: Efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees",
abstract = "The assembly of large, repeat-rich eukaryotic genomes represents a significant challenge in genomics. While long-read technologies have made the high-quality assembly of small, microbial genomes increasingly feasible, data generation can be expensive for larger genomes. OPERA-LG is a scalable, exact algorithm for the scaffold assembly of large, repeat-rich genomes, out-performing state-of-the-art programs for scaffold correctness and contiguity. It provides a rigorous framework for scaffolding of repetitive sequences and a systematic approach for combining data from different second-generation and third-generation sequencing technologies. OPERA-LG provides an avenue for systematic augmentation and improvement of thousands of existing draft eukaryotic genome assemblies.",
author = "Song Gao and Denis Bertrand and Chia, {Burton K H} and Niranjan Nagarajan",
year = "2016",
month = "5",
day = "11",
doi = "10.1186/s13059-016-0951-y",
language = "English",
volume = "17",
journal = "Genome Biology",
issn = "1465-6906",
publisher = "BioMed Central",
number = "1",
}

TY - JOUR
T1 - OPERA-LG
T2 - Genome Biology
AU - Gao, Song
AU - Bertrand, Denis
AU - Chia, Burton K H
AU - Nagarajan, Niranjan
PY - 2016/5/11
Y1 - 2016/5/11
AB - The assembly of large, repeat-rich eukaryotic genomes represents a significant challenge in genomics. While long-read technologies have made the high-quality assembly of small, microbial genomes increasingly feasible, data generation can be expensive for larger genomes. OPERA-LG is a scalable, exact algorithm for the scaffold assembly of large, repeat-rich genomes, out-performing state-of-the-art programs for scaffold correctness and contiguity. It provides a rigorous framework for scaffolding of repetitive sequences and a systematic approach for combining data from different second-generation and third-generation sequencing technologies. OPERA-LG provides an avenue for systematic augmentation and improvement of thousands of existing draft eukaryotic genome assemblies.
UR - http://www.scopus.com/inward/record.url?scp=84977499672&partnerID=8YFLogxK
U2 - 10.1186/s13059-016-0951-y
DO - 10.1186/s13059-016-0951-y
M3 - Article
VL - 17
JO - Genome Biology
JF - Genome Biology
SN - 1465-6906
IS - 1
M1 - 102
ER -