| INTERDISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY |
Prev
Next
|
|
|
Extracting fuzzy clusters from massive attributed graphs using Markov lumpability optimization |
| Kai-Yue Jiang(蒋凯悦)1,†, Li-Heng Xu(徐力恒)1,†, Shi-Pei Lin(林诗佩)1,†, Li-Yang Zhou(周李阳)1,†, Hui-Jia Li(李慧嘉)1,‡, and Ge Gao(高歌)2,§ |
1 School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300074, China; 2 Management School, Beijing Sport University, Beijing 100084, China |
|
|
|
|
Abstract Attributed graph clustering plays a vital role in uncovering hidden network structures, but it presents significant challenges. In recent years, various models have been proposed to identify meaningful clusters by integrating both structural and attribute-based information. However, these models often emphasize node proximities without adequately balancing the efficiency of clustering based on both structural and attribute data. Furthermore, they tend to neglect the critical fuzzy information inherent in attributed graph clusters. To address these issues, we introduce a new framework, Markov lumpability optimization, for efficient clustering of large-scale attributed graphs. Specifically, we define a lumped Markov chain on an attribute-augmented graph and introduce a new metric, Markov lumpability, to quantify the differences between the original and lumped Markov transition probability matrices. To minimize this measure, we propose a conjugate gradient projection-based approach that ensures the partitioning closely aligns with the intrinsic structure of fuzzy clusters through conditional optimization. Extensive experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed framework compared to existing clustering algorithms. This framework has many potential applications, including dynamic community analysis of social networks, user profiling in recommendation systems, functional module identification in biological molecular networks, and financial risk control, offering a new paradigm for mining complex patterns in high-dimensional attributed graph data.
|
Received: 07 March 2025
Revised: 11 May 2025
Accepted manuscript online: 20 May 2025
|
|
PACS:
|
89.75.Da
|
(Systems obeying scaling laws)
|
| |
02.50.Ey
|
(Stochastic processes)
|
| |
07.05.Tp
|
(Computer modeling and simulation)
|
| |
05.40.-a
|
(Fluctuation phenomena, random processes, noise, and Brownian motion)
|
|
| Fund: This work is supported by the National Natural Science Foundation of China (Grant No. 72571150) and Beijing Natural Science Foundation (Grant No. 9182015). |
Corresponding Authors:
Hui-Jia Li, Ge Gao
E-mail: hjli@nankai.edu.cn;gaoge@bsu.edu.cn
|
Cite this article:
Kai-Yue Jiang(蒋凯悦), Li-Heng Xu(徐力恒), Shi-Pei Lin(林诗佩), Li-Yang Zhou(周李阳), Hui-Jia Li(李慧嘉), and Ge Gao(高歌) Extracting fuzzy clusters from massive attributed graphs using Markov lumpability optimization 2025 Chin. Phys. B 34 108903
|
[1] Xiao C, Xu X, Lei Y, Zhang K, Liu S and Zhou F 2023 IEEE Trans. Knowl. Data Eng. 35 10540 [2] Qin M, Zhang C, Bai B, Zhang G and Yeung D Y 2023 ACM Trans. Knowl. Discov. Data 17 1 [3] Qiu R, Yin H, Huang Z and Chen T 2020 Proc. 43rd Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. 669 [4] Kargar M, Zihayat M and Szlichta J 2019 Proc. 29th Annu. Int. Conf- Comput. Sci. Sofw. Eng. 19 397 [5] Albert R and Barabasi A L 2002 Rev. Mod. Phys. 74 47 [6] Yang Y, Su X, Zhao B, Li G, Hu P, Zhang J and Hu L 2024 IEEE Trans. Fuzzy Syst. 32 1951 [7] Li H J, Feng Y, Xia C and Cao J 2024 ACM Trans. Knowl. Discov. Data 18 1 [8] Li H J, Cao H, Feng Y, Li X and Pei J 2024 IEEE Trans. Knowl. Data Eng. 36 11 [9] He C, Liu S, Zhang L and Zheng J 2019 J. Ambient Intell. Humaniz. Comput. 10 3399 [10] Xie X X, Huo L A, Dong Y F and Cheng Y Y 2024 Chin. Phys. B 33 038704 [11] Hu X, Chen J X and Cheng Y 2024 Chin. Phys. B 33 100202 [12] Qiu L, Jia W, Yu J, Fan X and Gao W 2019 IEEE Access 7 62511 [13] Tu W, Liu X, Zhou S, Li M, Zhu E, Liu L, Zhang C and Yin J 2021 Proc AAAI Conf. Artif. Intell. 35 9978 [14] Yang Y, Su X, Zhao B, Li G, Peng W, Zhang J and Hu L 2023 IEEE Trans. Fuzzy Syst. 32 4 [15] Pons P and Latapy M 2006 J. Graph Algorithms Appl. 10 191 [16] Hu L, Yang Y, Tang Z, He Y and Luo X 2023 IEEE Trans. Fuzzy Syst. 31 10 [17] Ahmed M, Seraj R and Islam S M S 2020 Electronics 9 1295 [18] Grohe M and Schweitzer P 2020 Commun. ACM 63 128 [19] Ikotun A M, Ezugwu A E, Abualigah L, Abuhaija B and Heming J 2023 Inf. Sci. 622 178 [20] Talaei Khoei T, Ould Slimane H and Kaabouch N 2023 Neural Comput. Appl. 35 23103 [21] Mohammed A and Kora R 2023 J. King Saud Univ. - Comput. Inf. Sci. 35 757 [22] Anwar S, Azeem M, Jamil M K, Almohsen B and Shang Y 2024 J. Supercomput. 80 19976 [23] Huo L A and Yu Y 2023 Chin. Phys. B 32 108703 [24] Gretton A, Bousquet O, Smola A and Scholkopf B 2005 Int. Conf. Algorithmic Learn. Theory 63 [25] Nazareth J L 2009 Wiley Interdiscip. Rev.: Comput. Stat. 80 19976 [26] Hashem I, Alaba F, Jumare M, Ibrahim A and Abulfaraj A 2024 IEEE Access 12 33757 [27] E W, Li T and Vanden-Eijnden E 2008 Proc. Natl. Acad. Sci. USA 105 7907 [28] Zhou S, Liu X, Li M, Zhu E, Liu L, Zhang C and Yin J 2019 IEEE Trans. Neural Netw. Learn. Syst. 31 1351 [29] Wang S, Liu X, Liu L, Zhou S and Zhu E 2023 IEEE Trans. Neural Netw. Learn. Syst. 34 4359 [30] Wang S, Liu X, Zhu X, Zhang P, Zhang Y, Gao F and Zhu E 2021 IEEE Trans. Image Process. 31 556 [31] Berahmand K, Mohammadi M, Saberi-Movahed F, Li Y and Xu Y 2023 IEEE Trans. Netw. Sci. Eng. 10 372 [32] Rohan H R K and Sahadath M H 2023 Prog. Nucl. Energy 164 104849 [33] Cui G, Zhou J, Yang C and Liu Z 2020 Proc. 26th ACM SIGKDD Int. Conf. Knowl. Discov. & Data Mining 976 [34] Li Y, Wang P, Li Z, Yu J X and Li J 2024 Proc. 30th ACM SIGKDD Conf. Knowl. Discov. Data Min. 1725 [35] Temsah M H, Altamimi I, Jamal A, Alhasan K and Al-Eyadhy 2023 Cureus 15 9 [36] Peng Z, Liu H, Jia Y and Hou J 2022 IEEE Trans. Circuits Syst. Video Technol. 33 3296 [37] Yang X, Liu Y, Zhou S, Wang S, Tu W, Zheng Q, Liu X, Fang L and Zhu E 2023 Proc. AAAI Conf. Artif. Intell. 37 10834 [38] Cunha W, Viegas F, França C, Rosa T, Rocha L and Gonçalves M A 2023 ACM Comput. Surv. 55 1 [39] Liu Y, Tu W, Zhou S, Liu X, Song L, Yang X and Zhu E 2022 AAAI Conf. Artif. Intell. 2022 [40] Wu X, Hu J, Zhang A, Quan Y, Miao Q and Sun P G 2024 arXiv:2408.09790 [41] Li H J, Song S, Tan W, Huang Z, Li X, Xu W and Cao J 2022 Neurocomputing 512 482 |
| No Suggested Reading articles found! |
|
|
Viewed |
|
|
|
Full text
|
|
|
|
|
Abstract
|
|
|
|
|
Cited |
|
|
|
|
Altmetric
|
|
blogs
Facebook pages
Wikipedia page
Google+ users
|
Online attention
Altmetric calculates a score based on the online attention an article receives. Each coloured thread in the circle represents a different type of online attention. The number in the centre is the Altmetric score. Social media and mainstream news media are the main sources that calculate the score. Reference managers such as Mendeley are also tracked but do not contribute to the score. Older articles often score higher because they have had more time to get noticed. To account for this, Altmetric has included the context data for other articles of a similar age.
View more on Altmetrics
|
|
|