Chinese Physics B ›› 2025, Vol. 34 ›› Issue (8): 088704-088704. doi: 10.1088/1674-1056/add508

Special Topic: A celebration of the 90th Anniversary of the Birth of Bolin Hao

CVTree for 16S rRNA: Constructing taxonomy-compatible all-species living tree effectively and efficiently

Yi-Fei Lu(卢逸飞)2, Xiao-Yang Zhi(职晓阳)2,†, and Guang-Hong Zuo(左光宏)1,‡   

  1 Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325001, China;
  2 Yunnan Institute of Microbiology, Key Laboratory of Microbial Diversity in Southwest China of Ministry of Education, School of Life Sciences, Yunnan University, Kunming 650091, China
  • Received: 2025-03-29 Revised: 2025-04-21 Accepted: 2025-05-07 Online: 2025-07-17 Published: 2025-07-17
  • Corresponding authors: Xiao-Yang Zhi, Guang-Hong Zuo E-mail: xyzhi@ynu.edu.cn; ghzuo@ucas.ac.cn
  • Supported by:
    GHZ thanks the Wenzhou Institute, University of Chinese Academy of Sciences (Grant No. WIUCASQD2021042).

Abstract: The composition vector tree (CVTree) method, developed under the leadership of Professor Hao Bailin, is an alignment-free algorithm for constructing phylogenetic trees. Although initially designed for studying prokaryotic evolution based on whole genomes, it has demonstrated broad applicability across diverse biological systems and gene sequences. In this study, we employed two CVTree methods, InterList and Hao, to investigate the phylogeny and taxonomy of prokaryotes based on the 16S rRNA sequences from the All-Species Living Tree Project. We constructed a comprehensive phylogenetic tree that covers most documented prokaryotic species and compared it with the taxonomy of prokaryotes. We also compared the performance of CVTree with that of multiple sequence alignment-based approaches. Our results show that the CVTree methods run one to three orders of magnitude faster than conventional alignment methods while remaining highly consistent with established taxonomic relationships, and even outperform some multiple sequence alignment methods. These findings confirm CVTree's effectiveness and efficiency not only for whole-genome evolutionary studies but also for gene-based phylogenetic and taxonomic investigations.

Keywords: phylogenetic tree, taxonomy, 16S rRNA, ratio of entropy reduction
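
To illustrate what "alignment-free" means in the abstract, the following minimal Python sketch compares sequences by their k-mer composition instead of aligning them. It is a generic illustration under stated assumptions, not the InterList or Hao method of CVTree: the function names, the choice k = 3, the plain cosine-based dissimilarity, and the toy sequences are all hypothetical and do not come from the paper.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers in a nucleotide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def composition_distance(a: Counter, b: Counter) -> float:
    """Return (1 - cosine similarity) / 2 for two k-mer count vectors.

    Assumption: a plain cosine on raw counts, chosen for brevity.
    The value is 0 for identical compositions and 0.5 when no k-mer is shared.
    """
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("sequence is shorter than k")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


def distance_matrix(seqs: dict, k: int = 3) -> dict:
    """Compute pairwise composition distances without any alignment step."""
    vectors = {name: kmer_counts(s, k) for name, s in seqs.items()}
    names = sorted(vectors)
    return {
        (x, y): composition_distance(vectors[x], vectors[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Toy sequences, made up for illustration only (not real 16S rRNA data).
toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "TTTTGGGGCC"}
matrix = distance_matrix(toy, k=3)
```

A matrix like this could then be passed to any distance-based tree-building routine, such as neighbor joining. Because each sequence is reduced to a count vector once, the cost per pair does not depend on an alignment, which is the general reason alignment-free approaches tend to be fast.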

Classification (PACS): 87.15.Qt (Sequence analysis); 87.18.Wd (Genomics); 87.19.lo (Information theory)