中国物理B ›› 2013, Vol. 22 ›› Issue (10): 108903-108903.doi: 10.1088/1674-1056/22/10/108903

• INTERDISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY • 上一篇    下一篇

An improvement of the fast uncovering community algorithm

王莉a b, 王将a b, 沈华伟b, 程学旗b   

  1. a College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China;
    b Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • 收稿日期:2013-05-14 修回日期:2013-06-09 出版日期:2013-08-30 发布日期:2013-08-30
  • 基金资助:
    Project supported by the Major State Basic Research Development Program of China (Grant Nos. 2013CB329602 and 2012CB316303), the Science Research Foundation for the Returned Overseas Chinese Scholars, China (Grant No. 2010-31), the International Collaborative Project of Shanxi Province, China (Grant No. 2011081034), and the National Natural Science Foundation of China (Grant Nos. 61232010 and 61202215).

An improvement of the fast uncovering community algorithm

Wang Li (王莉)a b, Wang Jiang (王将)a b, Shen Hua-Wei (沈华伟)b, Cheng Xue-Qi (程学旗)b   

  1. a College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China;
    b Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received:2013-05-14 Revised:2013-06-09 Online:2013-08-30 Published:2013-08-30
  • Contact: Wang Li E-mail:wangli2011@ict.ac.cn
  • Supported by:
    Project supported by the Major State Basic Research Development Program of China (Grant Nos. 2013CB329602 and 2012CB316303), the Science Research Foundation for the Returned Overseas Chinese Scholars, China (Grant No. 2010-31), the International Collaborative Project of Shanxi Province, China (Grant No. 2011081034), and the National Natural Science Foundation of China (Grant Nos. 61232010 and 61202215).

摘要: Community detection methods have been used in computer, sociology, physics, biology, and brain information science areas. Many methods are based on the optimization of modularity. The algorithm proposed by Blondel et al. (Blondel V D, Guillaume J L, Lambiotte R and Lefebvre E 2008 J. Stat. Mech. 10 10008) is one of the most widely used methods because of its good performance, especially in the big data era. In this paper we make some improvements to this algorithm in correctness and performance. By tests we see that different node orders bring different performances and different community structures. We find some node swings in different communities that influence the performance. So we design some strategies on the sweeping order of node to reduce the computing cost made by repetition swing. We introduce a new concept of overlapping degree (OV) that shows the strength of connection between nodes. Three improvement strategies are proposed that are based on constant OV, adaptive OV, and adaptive weighted OV, respectively. Experiments on synthetic datasets and real datasets are made, showing that our improved strategies can improve the performance and correctness.

关键词: community division, algorithm improvement, performance

Abstract: Community detection methods have been used in computer, sociology, physics, biology, and brain information science areas. Many methods are based on the optimization of modularity. The algorithm proposed by Blondel et al. (Blondel V D, Guillaume J L, Lambiotte R and Lefebvre E 2008 J. Stat. Mech. 10 10008) is one of the most widely used methods because of its good performance, especially in the big data era. In this paper we make some improvements to this algorithm in correctness and performance. By tests we see that different node orders bring different performances and different community structures. We find some node swings in different communities that influence the performance. So we design some strategies on the sweeping order of node to reduce the computing cost made by repetition swing. We introduce a new concept of overlapping degree (OV) that shows the strength of connection between nodes. Three improvement strategies are proposed that are based on constant OV, adaptive OV, and adaptive weighted OV, respectively. Experiments on synthetic datasets and real datasets are made, showing that our improved strategies can improve the performance and correctness.

Key words: community division, algorithm improvement, performance

中图分类号:  (Structures and organization in complex systems)

  • 89.75.Fb
89.75.Hc (Networks and genealogical trees)