python - Why is the Girvan-Newman algorithm in NetworkX so slow

Reposted by: 行者123 Updated: 2023-12-04 00:55:21

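I have approximately 4000 nodes and 6000 edges in my graphml file, and I have no problem converting it to networkx's DiGraph format. However, when I run girvan_newman() from networkx, it appears to be stuck: the script has been running for the past 10 hours and still has not finished (I tried it with 10 nodes and edges and it finished within 5 minutes).
Here is my snippet: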

import community as community_louvain
import networkx as nx
from networkx.algorithms.community.centrality import girvan_newman

G = nx.read_graphml('graph.graphml')
partition_girvan_newman = girvan_newman(G)
list(partition_girvan_newman)

My questions are:

Does NetworkX's girvan_newman() only accept undirected graphs?

If girvan_newman in networkx can indeed handle this amount of data, what should I modify to make it faster?

Best answer

The Girvan–Newman algorithm is computationally very expensive. As mentioned in the NetworkX docs:

The Girvan–Newman algorithm detects communities by progressively removing edges from the original graph. The algorithm removes the “most valuable” edge, traditionally the edge with the highest betweenness centrality, at each step.

Looking at the source code, the following loop is executed:

while g.number_of_edges() > 0:
    yield _without_most_central_edges(g, most_valuable_edge)

which in turn calls:

while num_new_components <= original_num_components:
    edge = most_valuable_edge(G)
    G.remove_edge(*edge)
    new_components = tuple(nx.connected_components(G))
    num_new_components = len(new_components)
return new_components

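So at each step the most valuable edge, by default the edge with the highest betweenness centrality, is removed and the connected components are recomputed. Roughly speaking, the total cost is the number of edges multiplied by the cost of one edge-betweenness computation plus one connected-components pass. Edge betweenness alone takes O(nm) time on an unweighted graph with n nodes and m edges, so a full run is on the order of O(m²n), which is why a graph with 4000 nodes and 6000 edges takes so long.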
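Since the betweenness computation dominates each step, one possible way to reduce the cost is to pass a custom most_valuable_edge function that only estimates betweenness from a sample of source nodes. The sketch below assumes that an approximate ranking of edges is acceptable; make_sampled_edge_picker, sample_size and seed are hypothetical names introduced for illustration, and the resulting communities may differ from those of the exact algorithm.

```python
import networkx as nx
from networkx.algorithms.community.centrality import girvan_newman


def make_sampled_edge_picker(sample_size, seed=None):
    """Build a most_valuable_edge function that uses sampled betweenness.

    sample_size and seed are hypothetical knobs for this sketch.
    """
    def most_valuable_edge(G):
        # k may not exceed the number of nodes, so clamp it.
        k = min(sample_size, G.number_of_nodes())
        # Estimate edge betweenness from k sampled source nodes only.
        betweenness = nx.edge_betweenness_centrality(G, k=k, seed=seed)
        return max(betweenness, key=betweenness.get)

    return most_valuable_edge


# Toy graph: two 5-node cliques joined by a single bridge edge.
G = nx.barbell_graph(5, 0)
picker = make_sampled_edge_picker(sample_size=6, seed=42)
first_split = next(girvan_newman(G, most_valuable_edge=picker))
print(tuple(sorted(c) for c in first_split))
```

A smaller sample_size makes each step cheaper but the edge ranking noisier, so it is worth comparing against the exact result on a small subgraph before trusting it on the full graph.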
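The docs mention some ways of slicing the returned generator to keep only the first k tuples of communities. If you only want to run the algorithm up to the kth iteration, here is one way: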

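import networkx as nx
from itertools import islice, takewhile
from networkx.algorithms.community.centrality import girvan_newman

G = nx.fast_gnp_random_graph(10, 0.2)
k = 2
comp = girvan_newman(G)
# Consume only the first k levels of the hierarchy.
for communities in islice(comp, k):
    print(tuple(sorted(c) for c in communities))

Sample output (the graph is random, so results vary between runs):

([0, 3, 4, 8], [1, 5], [2], [6, 7, 9])
([0, 3], [1, 5], [2], [4, 8], [6, 7, 9])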

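Alternatively, use itertools.takewhile to take tuples until the number of communities exceeds some threshold. This seems like an interesting approach, since it lets you impose the desired number of clusters, for example: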

G = nx.fast_gnp_random_graph(10, 0.3)
k = 4
comp = girvan_newman(G)
# Stop as soon as a partition has more than k communities.
limited = takewhile(lambda c: len(c) <= k, comp)
for communities in limited:
    print(tuple(sorted(c) for c in communities))

Sample output (varies between runs):

([0, 1, 2, 3, 4, 5, 6, 7, 8], [9])
([0, 2, 4, 7, 8], [1, 3, 5, 6], [9])
([0, 2, 4, 7, 8], [1], [3, 5, 6], [9])

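To answer your first question, you will see in the source code that the graph is copied into an undirected graph with g = G.copy().to_undirected(), so yes, it effectively works only on undirected graphs: a DiGraph is accepted, but its edge directions are discarded.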
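As a quick check of that behaviour, the following minimal sketch runs girvan_newman on a small DiGraph and on its undirected version and compares the first split. The toy graph and the helper as_sets are made up for illustration.

```python
import networkx as nx
from networkx.algorithms.community.centrality import girvan_newman

# Two directed 3-cycles joined by the single directed edge 2 -> 3.
D = nx.DiGraph([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)])


def as_sets(split):
    """Turn a tuple of communities into an order-independent set."""
    return {frozenset(c) for c in split}


# The DiGraph is accepted, but the algorithm works on an undirected copy.
split_directed = as_sets(next(girvan_newman(D)))
split_undirected = as_sets(next(girvan_newman(D.to_undirected())))
print(split_directed == split_undirected)
```

If edge direction matters for your communities, this is a sign that a direction-aware method would be needed instead.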

Regarding "python - Why is the Girvan-Newman algorithm in NetworkX so slow", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62951320/
