SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.
Literature Information
| Field | Value |
|---|---|
| DOI | 10.1089/cmb.2012.0021 |
| PMID | 22506599 |
| Journal | Journal of computational biology : a journal of computational molecular cell biology |
| Impact Factor | 1.6 |
| JCR Quartile | Q2 |
| Publication Year | 2012 |
| Times Cited | 13921 |
| Keywords | Genome Assembly, Single-Cell Sequencing, SPAdes Algorithm |
| Literature Type | Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't |
| ISSN | 1066-5277 |
| Pages | 455-477 |
| Volume(Issue) | 19(5) |
| Authors | Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A Gurevich, Mikhail Dvorkin, Alexander S Kulikov, Valery M Lesin, Sergey I Nikolenko, Son Pham, Andrey D Prjibelski, Alexey V Pyshkin, Alexander V Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A Alekseyev, Pavel A Pevzner |
TL;DR
This research introduces SPAdes, a novel assembler designed for single-cell and multicell genomic data, addressing challenges like non-uniform read coverage and sequencing errors that hinder the assembly of genomes from uncultivated bacteria. SPAdes outperforms existing assemblers, significantly enhancing our ability to generate whole-genome assemblies from uncultured organisms, thus advancing our understanding of microbial diversity beyond traditional metagenomics.
Abstract
The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.
Primary Questions Addressed
- What are the specific advantages of SPAdes over other assemblers like E+V-SC, Velvet, and SoapDeNovo in handling single-cell sequencing data?
- How does SPAdes address the challenges of non-uniform read coverage and sequencing errors in single-cell genomics?
- In what ways can the application of SPAdes enhance our understanding of uncultivatable bacteria compared to traditional metagenomic approaches?
- What types of environments or conditions are most conducive to the successful application of SPAdes for single-cell sequencing?
- How does the open-source nature of SPAdes contribute to its adoption and further development in the field of genomics?
Key Findings
Research Background and Purpose
The majority of bacteria in diverse environments cannot be cultured in the laboratory, which limits their genomic analysis through traditional sequencing methods. This bottleneck hampers efforts such as the Human Microbiome Project and antibiotic discovery. The paper introduces SPAdes, a genome assembly algorithm that improves the assembly of single-cell genomic data, particularly from uncultivated organisms, by addressing non-uniform read coverage and sequencing errors.
Main Methods/Materials/Experimental Design
SPAdes assembles single-cell sequencing (SCS) data in four stages, each addressing a specific challenge:
- Assembly graph construction: builds the assembly graph from a multisized de Bruijn graph and simplifies it by removing bulges, tips, and chimeric reads while retaining useful genomic information.
- K-bimer adjustment: derives accurate distance estimates between k-mers by jointly analyzing read-pair distance histograms and paths in the assembly graph.
- Paired assembly graph construction: builds a graph that incorporates the adjusted k-bimers, so that the resulting contigs are more reliable than those produced by traditional methods.
- Contig construction: generates the DNA sequences of contigs and maps reads to these contigs by backtracking the graph simplifications.
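To make the role of the k-mer size concrete, the following is a minimal Python sketch of a plain de Bruijn graph built from reads for several values of k. It is not the SPAdes implementation and does not build a multisized graph; the function names (`build_de_bruijn`, `count_branching`, `count_components`) and the toy reads are assumptions made for illustration. It only shows the trade-off that motivates combining several k values: a small k keeps the graph connected but introduces repeat-induced branching, while a large k removes branching but can fragment the graph where reads overlap only briefly.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def count_branching(graph):
    """Count nodes with more than one successor (ambiguous extensions)."""
    return sum(1 for succs in graph.values() if len(succs) > 1)


def count_components(graph):
    """Count weakly connected components by treating edges as undirected."""
    undirected = defaultdict(set)
    for src, succs in graph.items():
        for dst in succs:
            undirected[src].add(dst)
            undirected[dst].add(src)
    seen = set()
    components = 0
    for start in undirected:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(undirected[node] - seen)
    return components


# Two toy reads (hypothetical data) that overlap by four bases: "GTAC".
reads = ["ACGTAC", "GTACGG"]
summary = {}
for k in (4, 5, 6):
    g = build_de_bruijn(reads, k)
    summary[k] = (count_branching(g), count_components(g))
    print(k, summary[k])
```

On this toy input, k = 4 yields one branching node, k = 5 yields a single unbranched path, and k = 6 splits the graph into two components because the reads no longer share a 5-mer.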
Key Results and Findings
- SPAdes outperforms existing assemblers such as E+V-SC, Velvet, and SoapDeNovo when assembling genomes from single-cell data.
- The algorithm effectively manages sequencing errors and non-uniform coverage, leading to accurate distance estimates for genomic assembly.
- Benchmarking results show that SPAdes assembled a significant percentage of the E. coli genome with high N50 values and minimal misassemblies.
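For readers unfamiliar with the N50 statistic mentioned in the benchmarking results, here is a short Python sketch of how it is commonly computed from a list of contig lengths. The function name `n50` and the example lengths are illustrative assumptions, not values from the paper.

```python
def n50(contig_lengths):
    """Return the length L of the contig at which the running total of
    contig lengths, taken from longest to shortest, first reaches at
    least half of the total assembly length."""
    if not contig_lengths:
        return 0
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Hypothetical contig lengths; total is 290, so half is 145.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_lengths)
print(example_n50)
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is usually reported together with the number of misassemblies rather than on its own.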
Main Conclusions/Significance/Innovation
SPAdes represents a significant advancement in the field of genome assembly, particularly for single-cell genomics. Its innovative use of multisized de Bruijn graphs and k-bimer adjustments allows for more accurate and efficient assembly of complex genomic data from uncultivated organisms. This tool is expected to facilitate breakthroughs in understanding microbial communities and advancing fields such as metagenomics and antibiotic discovery.
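The idea behind k-bimer adjustment can be illustrated with a deliberately simplified Python sketch. It assumes a k-bimer can be viewed as a pair of k-mers with an estimated distance between them, and it treats adjustment as choosing, among the distances realized by paths in the assembly graph, the one best supported by the read-pair distance histogram. The function `adjust_distance`, its `tolerance` parameter, and the sample numbers are hypothetical; the actual SPAdes procedure is more involved.

```python
from collections import Counter


def adjust_distance(observed_distances, path_distances, tolerance):
    """Return the graph-path distance with the most read-pair support.

    observed_distances: noisy distances inferred from read pairs.
    path_distances: distances realized by paths in the assembly graph.
    tolerance: maximum deviation for an observation to count as support.
    Returns None when no candidate has any support.
    """
    histogram = Counter(observed_distances)
    best, best_support = None, 0
    for candidate in sorted(path_distances):
        support = sum(
            count
            for distance, count in histogram.items()
            if abs(distance - candidate) <= tolerance
        )
        if support > best_support:
            best, best_support = candidate, support
    return best


# Hypothetical observations clustered near 100, with one outlier at 140.
observed = [98, 101, 100, 103, 99, 140, 97]
adjusted = adjust_distance(observed, path_distances=[100, 120, 150], tolerance=5)
print(adjusted)
```

Restricting the answer to distances that some graph path actually realizes is what lets noisy read-pair estimates be replaced by exact values in this sketch.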
Research Limitations and Future Directions
While SPAdes improves upon existing methods, it is primarily focused on bacterial genomes. Future research should aim to extend its capabilities to accommodate structural variations in human genomes and other complex organisms. Additionally, further validation across a broader range of datasets will be necessary to assess its robustness and versatility in various genomic contexts.
