SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.
Publication information
| Field | Value |
|---|---|
| DOI | 10.1089/cmb.2012.0021 |
| PMID | 22506599 |
| Journal | Journal of computational biology : a journal of computational molecular cell biology |
| Impact factor | 1.6 |
| JCR quartile | Q2 |
| Publication year | 2012 |
| Citations | 13921 |
| Keywords | genome assembly, single-cell sequencing, SPAdes algorithm, microbiome, open-source software |
| Article type | Journal Article, Research Support, N.I.H., Extramural, Research Support, Non-U.S. Gov't |
| ISSN | 1066-5277 |
| Pages | 455-77 |
| Volume(issue) | 19(5) |
| Authors | Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A Gurevich, Mikhail Dvorkin, Alexander S Kulikov, Valery M Lesin, Sergey I Nikolenko, Son Pham, Andrey D Prjibelski, Alexey V Pyshkin, Alexander V Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A Alekseyev, Pavel A Pevzner |
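One-sentence summary
This study presents SPAdes, a new genome assembler for single-cell data, motivated by the fact that most environmental bacteria cannot be cultured in the laboratory and so cannot be sequenced with standard techniques. SPAdes improves on existing single-cell and multicell assemblers in assembly accuracy and genome coverage, and it yields richer genomic information about uncultivable bacteria.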
Abstract
The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.
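Main research questions
- What specific technical innovations does SPAdes introduce for handling high error rates and non-uniform read coverage?
- Besides SPAdes, which emerging genome assembly algorithms show promise for single-cell sequencing?
- How can the accuracy and reliability of SPAdes assemblies of uncultivable bacterial genomes be assessed?
- What advantages does SPAdes offer over traditional metagenomics approaches in data interpretation and application?
- How does using SPAdes for single-cell genome assembly affect downstream analyses such as functional genomics?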
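Key insights
Background and objectives
Many bacteria found in various environments cannot be cultured in the laboratory and therefore cannot be sequenced with existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultured organisms. This study proposes SPAdes, a new genome assembly algorithm intended to improve the efficiency and accuracy of both single-cell and multicell assembly.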
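Main methods / materials / experimental design
SPAdes proceeds in four stages designed to address the main challenges of single-cell sequencing: sequencing errors, non-uniform coverage, insert size variation, and chimeric reads.
- Assembly graph construction: builds a multisized de Bruijn graph and simplifies it with new algorithms, including removal of bulges and chimeric reads.
- k-bimer adjustment: derives accurate estimates of the distances between k-mers in the genome by jointly analyzing distance histograms and paths in the graph.
- Paired assembly graph construction: builds the paired assembly graph from the adjusted h-biedges.
- Contig construction: outputs the contigs as DNA sequences and maps reads back to them by backtracking the graph simplifications.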
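To make the first stage more concrete, here is a minimal sketch of how a de Bruijn graph can be derived from reads for more than one value of k. It is an illustration only, not the SPAdes implementation: the function names `kmer_counts` and `de_bruijn_edges`, the `min_count` parameter, and the toy reads are all assumptions introduced for this example, and it does not merge the graphs for different k the way a multisized de Bruijn graph does.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def de_bruijn_edges(reads, k, min_count=1):
    """Return the de Bruijn graph as a set of (prefix, suffix) edges.

    Each distinct k-mer seen at least `min_count` times contributes one
    edge from its (k-1)-mer prefix to its (k-1)-mer suffix.
    """
    return {
        (kmer[:-1], kmer[1:])
        for kmer, count in kmer_counts(reads, k).items()
        if count >= min_count
    }


# Toy reads, invented for illustration.
reads = ["ACGTAC", "CGTACG", "GTACGA"]

# Build the graph for two k values to compare the resulting edge sets.
for k in (3, 4):
    edges = de_bruijn_edges(reads, k)
    print(f"k={k}: {len(edges)} edges")
```

The fixed `min_count` cutoff is a deliberate simplification. With the highly non-uniform coverage of single-cell data, a single global threshold would likely discard genuine low-coverage k-mers, which is one reason dedicated graph simplification is needed.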
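Key results and findings
- SPAdes was benchmarked on single-cell and multicell datasets and outperformed existing assemblers such as E+V-SC, Velvet, and SOAPdenovo.
- On the single-cell E. coli dataset (ECOLI-SC), SPAdes assembled about 96.1% of the genome with an N50 of 49,623 bp and only one misassembly.
- SPAdes captured 100 more E. coli genes than E+V-SC.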
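The N50 reported above is a standard contiguity metric: the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of the calculation follows; the function name `n50` and the example lengths are assumptions for illustration and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to at least half
        # of the total assembly length defines the N50.
        if running >= half_total:
            return length
    return 0


# Invented contig lengths: total is 290, so the half-way mark is 145.
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why the result above is reported together with the misassembly count.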
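Main conclusions / significance / novelty
SPAdes shows that combining new algorithmic design with paired-read information can substantially improve the quality of single-cell genome assemblies. It deepens the understanding of uncultured bacterial genomes and provides an important tool for future single-cell genomics research.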
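Limitations and future directions
Although SPAdes performs well on single-cell assembly, it still has limitations, such as its adaptability to more complex datasets. Future directions include extending SPAdes to handle structural variation in human single-cell projects and exploring methods for analyzing more complex single-cell genomic data.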
References
- Single-cell dissection of transcriptional heterogeneity in human colon tumors. - Piero Dalerba;Tomer Kalisky;Debashis Sahoo;Pradeep S Rajendran;Michael E Rothenberg;Anne A Leyrat;Sopheak Sim;Jennifer Okamoto;Darius M Johnston;Dalong Qian;Maider Zabala;Janet Bueno;Norma F Neff;Jianbin Wang;Andrew A Shelton;Brendan Visser;Shigeo Hisamori;Yohei Shimono;Marc van de Wetering;Hans Clevers;Michael F Clarke;Stephen R Quake - Nature biotechnology (2011)
- Short read fragment assembly of bacterial genomes. - Mark J Chaisson;Pavel A Pevzner - Genome research (2008)
- Automated de novo protein sequencing of monoclonal antibodies. - Nuno Bandeira;Victoria Pham;Pavel Pevzner;David Arnott;Jennie R Lill - Nature biotechnology (2008)
- Genome of a low-salinity ammonia-oxidizing archaeon determined by single-cell and metagenomic analysis. - Paul C Blainey;Annika C Mosier;Anastasia Potanina;Christopher A Francis;Stephen R Quake - PloS one (2011)
- HiTEC: accurate error correction in high-throughput sequencing data. - Lucian Ilie;Farideh Fazayeli;Silvana Ilie - Bioinformatics (Oxford, England) (2011)
- Dissecting biological "dark matter" with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. - Yann Marcy;Cleber Ouverney;Elisabeth M Bik;Tina Lösekann;Natalia Ivanova;Hector Garcia Martin;Ernest Szeto;Darren Platt;Philip Hugenholtz;David A Relman;Stephen R Quake - Proceedings of the National Academy of Sciences of the United States of America (2007)
- Paired de bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers. - Paul Medvedev;Son Pham;Mark Chaisson;Glenn Tesler;Pavel Pevzner - Journal of computational biology : a journal of computational molecular cell biology (2011)
- A new algorithm for DNA sequence assembly. - R M Idury;M S Waterman - Journal of computational biology : a journal of computational molecular cell biology (1995)
- Velvet: algorithms for de novo short read assembly using de Bruijn graphs. - Daniel R Zerbino;Ewan Birney - Genome research (2008)
- Crystallizing short-read assemblies around seeds. - Mohammad Sajjad Hossain;Navid Azimi;Steven Skiena - BMC bioinformatics (2009)
Articles citing this paper
- SEQuel: improving the accuracy of genome assemblies. - Roy Ronen;Christina Boucher;Hamidreza Chitsaz;Pavel Pevzner - Bioinformatics (Oxford, England) (2012)
- Pathset graphs: a novel approach for comprehensive utilization of paired reads in genome assembly. - Son K Pham;Dmitry Antipov;Alexander Sirotkin;Glenn Tesler;Pavel A Pevzner;Max A Alekseyev - Journal of computational biology : a journal of computational molecular cell biology (2013)
- Genomic sequencing of uncultured microorganisms from single cells. - Roger S Lasken - Nature reviews. Microbiology (2012)
- The future is now: single-cell genomics of bacteria and archaea. - Paul C Blainey - FEMS microbiology reviews (2013)
- Advances for studying clonal evolution in cancer. - Li Ding;Benjamin J Raphael;Feng Chen;Michael C Wendl - Cancer letters (2013)
- Sequence assembly demystified. - Niranjan Nagarajan;Mihai Pop - Nature reviews. Genetics (2013)
- BayesHammer: Bayesian clustering for error correction in single-cell sequencing. - Sergey I Nikolenko;Anton I Korobeynikov;Max A Alekseyev - BMC genomics (2013)
- QUAST: quality assessment tool for genome assemblies. - Alexey Gurevich;Vladislav Saveliev;Nikolay Vyahhi;Glenn Tesler - Bioinformatics (Oxford, England) (2013)
- Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome. - Michael S Fitzsimons;Mark Novotny;Chien-Chi Lo;Armand E K Dichosa;Joyclyn L Yee-Greenbaum;Jeremy P Snook;Wei Gu;Olga Chertkov;Karen W Davenport;Kim McMurry;Krista G Reitenga;Ashlynn R Daughton;Jian He;Shannon L Johnson;Cheryl D Gleasner;Patti L Wills;Beverly Parson-Quintana;Patrick S Chain;John C Detter;Roger S Lasken;Cliff S Han - Genome research (2013)
- Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture. - Helena M B Seth-Smith;Simon R Harris;Rachel J Skilton;Frans M Radebe;Daniel Golparian;Elena Shipitsyna;Pham Thanh Duy;Paul Scott;Lesley T Cutcliffe;Colette O'Neill;Surendra Parmar;Rachel Pitt;Stephen Baker;Catherine A Ison;Peter Marsh;Hamid Jalal;David A Lewis;Magnus Unemo;Ian N Clarke;Julian Parkhill;Nicholas R Thomson - Genome research (2013)
