Chin. Phys. B, 2023, Vol. 32(10): 108702    DOI: 10.1088/1674-1056/acf03e
INTERDISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY

Combination of density-clustering and supervised classification for event identification in single-molecule force spectroscopy data

Yongyi Yuan(袁泳怡)1,2, Jialun Liang(梁嘉伦)1,2,†, Chuang Tan(谭创)1,2, Xueying Yang(杨雪滢)1,2, Dongni Yang(杨东尼)1,2, and Jie Ma(马杰)1,2,‡
1 School of Physics, Sun Yat-sen University, Guangzhou 510275, China;
2 State Key Laboratory of Optoelectronic Materials and Technologies, Sun Yat-sen University, Guangzhou 510006, China
Abstract  Single-molecule force spectroscopy (SMFS) measurements of biomolecular dynamics typically require identifying large numbers of events and states in large data sets, for example extracting rupture forces from force-extension curves (FECs) in pulling experiments and identifying states from extension-time trajectories (ETTs) in force-clamp experiments. The former is often done manually and is therefore time-consuming and laborious, while the latter is often impeded by baseline drift. In this study, we aim to identify events and states in SMFS experiments accurately and automatically with a machine learning approach that combines clustering and classification for event identification of SMFS (ACCESS). As demonstrated by the analysis of a series of data sets, ACCESS can extract rupture forces from FECs containing multiple unfolding steps and assign each rupture force to the corresponding conformational transition. Moreover, ACCESS successfully identifies the unfolded and folded states even when the ETTs display severe nonmonotonic baseline drift. ACCESS is also straightforward to use, as it requires only three easy-to-interpret parameters. We therefore anticipate that ACCESS will be a useful, easy-to-implement and high-performance tool for event and state identification across a range of single-molecule experiments.
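The abstract gives no algorithmic detail, so the following is only a generic sketch of how density-based clustering can be paired with a supervised classifier. It is not the ACCESS implementation. The synthetic (extension, force) points, the parameter names `eps`, `min_pts` and `k`, the class names, and the choice of a k-nearest-neighbour classifier are all assumptions made for illustration.

```python
import math
import random


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: return a cluster id per point, or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (includes i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_nbrs)
    return labels


def knn_predict(train_pts, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(range(len(train_pts)),
                    key=lambda i: math.dist(train_pts[i], query))
    votes = [train_labels[i] for i in ranked[:k]]
    return max(set(votes), key=votes.count)


# Synthetic stand-in for candidate events in (extension, force) space.
rng = random.Random(0)


def blob(cx, cy, n):
    return [(rng.gauss(cx, 0.2), rng.gauss(cy, 0.2)) for _ in range(n)]


points = blob(10.0, 5.0, 40) + blob(20.0, 12.0, 40)
points += [(0.0, 20.0), (30.0, 0.0), (15.0, 25.0)]  # isolated spurious points

# Step 1: density clustering separates dense event groups from sparse noise.
labels = dbscan(points, eps=1.0, min_pts=5)

# Step 2: clusters are given (hypothetical) class names, and the labelled
# points become the training set for a supervised classifier.
class_names = {0: "transition A", 1: "transition B"}
train_pts = [p for p, c in zip(points, labels) if c != -1]
train_labels = [class_names[c] for c in labels if c != -1]

# Step 3: new candidate events are assigned to a class.
pred_a = knn_predict(train_pts, train_labels, (10.1, 5.2), k=3)
pred_b = knn_predict(train_pts, train_labels, (19.8, 11.9), k=3)
print(pred_a, pred_b)
```

In this toy setting the clustering step discards isolated points without a manual threshold on force or extension, and the classification step reuses the clustered points as labelled examples for events seen later.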
Keywords:  single-molecule force spectroscopy, data analysis, density-based clustering, supervised classification
Received:  01 July 2023      Revised:  28 July 2023      Accepted manuscript online:  15 August 2023
PACS:  87.80.Nj (Single-molecule techniques)  
  82.37.-j (Single molecule kinetics)  
  87.15.H- (Dynamics of biomolecules)  
  07.05.Kf (Data analysis: algorithms and implementation; data management)  
Fund: Project supported by the National Natural Science Foundation of China (Grant No. 12074445) and the Open Fund of the State Key Laboratory of Optoelectronic Materials and Technologies of Sun Yat-sen University (Grant No. OEMT-2022-ZTS-05).
Corresponding Authors:  Jialun Liang, Jie Ma     E-mail:  liangjlun3@mail.sysu.edu.cn;majie6@mail.sysu.edu.cn

Cite this article: 

Yongyi Yuan(袁泳怡), Jialun Liang(梁嘉伦), Chuang Tan(谭创), Xueying Yang(杨雪滢), Dongni Yang(杨东尼), and Jie Ma(马杰). Combination of density-clustering and supervised classification for event identification in single-molecule force spectroscopy data. Chin. Phys. B, 2023, 32(10): 108702.
