Chinese Physics B ›› 2015, Vol. 24 ›› Issue (12): 128202-128202. doi: 10.1088/1674-1056/24/12/128202
• SPECIAL TOPIC: 8th IUPAP International Conference on Biological Physics
Yu Jia-Feng (于家峰)a b, Sui Tian-Xiang (隋天翔)a d, Wang Hong-Mei (王红梅)c, Wang Chun-Ling (王春玲)c, Jing Li (荆莉)c, Wang Ji-Hua (王吉华)a c
Abstract: Agrobacterium tumefaciens strain C58 is a plant pathogen that can cause tumors in some dicotyledonous plants. Ever since the genome of A. tumefaciens strain C58 was sequenced, the annotation quality of its protein-coding genes has been repeatedly questioned, because the annotation varies greatly among different databases. In this paper, the questionable hypothetical genes were re-predicted by integrating the TN curve and Z curve methods. As a result, 30 genes originally annotated as “hypothetical” were identified as non-coding sequences. When the re-prediction program was tested 10 times on data sets composed of genes with known functions, a mean accuracy of 99.99% and a mean Matthews correlation coefficient of 0.9999 were obtained. Further sequence analysis and COG analysis showed that the re-annotation results were very reliable. This work provides an efficient tool and data resources for future studies of A. tumefaciens strain C58.
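The abstract reports accuracy and the Matthews correlation coefficient (MCC) without stating how they are computed. The sketch below uses the standard textbook definitions for a binary coding/non-coding classifier; it is not the authors' program. The function names and the confusion-matrix counts are hypothetical and serve only as illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of sequences classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC for a 2x2 confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the paper does not say how this case is handled).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the paper:
    # tp = coding genes predicted as coding, tn = non-coding predicted as non-coding.
    tp, tn, fp, fn = 990, 985, 15, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.4f}")
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the coding and non-coding sets differ in size, which is presumably why the two metrics are reported together.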
CLC number: (Nucleic acids, DNA and RNA bases)