gpt4 book ai didi

python - 绘制 NetworkX Girvan-Newman 算法找到的社区的树状图

转载 作者:行者123 更新时间:2023-12-03 23:09:12 25 4
gpt4 key购买 nike

用于网络社区检测的 Girvan-Newman 算法:

detects communities by progressively removing edges from the original graph. The algorithm removes the “most valuable” edge, traditionally the edge with the highest betweenness centrality, at each step. As the graph breaks down into pieces, the tightly knit community structure is exposed and the result can be depicted as a dendrogram.



在 NetworkX 中,实现返回集合元组上的迭代器。第一个元组是由 2 个社区组成的第一个切割,第二个元组是由 3 个社区组成的第二个切割,依此类推,直到最后一个元组具有 n 个单独节点(树状图的叶子)的 n 个集合。

import networkx as nx

G = nx.path_graph(10)
comp = nx.community.girvan_newman(G)
list(comp)

[({0, 1, 2, 3, 4}, {5, 6, 7, 8, 9}), ({0, 1}, {2, 3, 4}, {5, 6, 7, 8, 9}), ({0, 1}, {2, 3, 4}, {5, 6}, {8, 9, 7}), ({0, 1}, {2}, {3, 4}, {5, 6}, {8, 9, 7}), ({0, 1}, {2}, {3, 4}, {5, 6}, {7}, {8, 9}), ({0}, {1}, {2}, {3, 4}, {5, 6}, {7}, {8, 9}), ({0}, {1}, {2}, {3}, {4}, {5, 6}, {7}, {8, 9}), ({0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8, 9}), ({0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9})]



问题是:如何绘制这个树状图?

我看过 scipy.cluster.hierarchy.dendrogram 但它需要一个我猜的“链接矩阵”,例如由 scipy.cluster.hierarchy.linkage 创建的那个。 ,但我不确定如何将这个元组列表转换为这个“链接矩阵”。

所以我在问如何绘制这个树状图,有/没有 SciPy 的帮助 dendrogram .

最佳答案

在@ItamarMushkin 之后,我遵循了@mdml 的答案,并稍作修改并得到了我想要的。在高层次上,我将 NetworkX 的 Girvan-Newman 迭代器输出变成另一个 DiGraph()我最终希望看到一个树状图。然后我构建 Z ,一个链接矩阵 I 输入到 scipy.cluster.hierarchy.dendrogram , 以边列表的形式包含每个树状图合并的实际高度。

我必须对@mdml 的回答进行两项修改:

  • 没那么重要:我对进入 index 的节点的元组键进行排序
  • 更重要的是:我添加了 get_merge_height函数,它根据边缘去除的 Girvan-Newman 输出顺序为每个合并提供其独特的高度。否则,两个节点的所有合并在树状图中将具有相同的高度,在合并两个节点的下一级中的所有合并以及另一个节点将具有相同的高度等。

  • 我知道这里可能有一些冗余迭代,我还没有考虑优化。

    import networkx as nx
    from itertools import chain, combinations
    import matplotlib.pyplot as plt
    from scipy.cluster.hierarchy import dendrogram

    # get simulated Graph() and Girvan-Newman communities list
    G = nx.path_graph(10)
    communities = list(nx.community.girvan_newman(G))

    # building initial dict of node_id to each possible subset:
    node_id = 0
    init_node2community_dict = {node_id: communities[0][0].union(communities[0][1])}
    for comm in communities:
    for subset in list(comm):
    if subset not in init_node2community_dict.values():
    node_id += 1
    init_node2community_dict[node_id] = subset

    # turning this dictionary to the desired format in @mdml's answer
    node_id_to_children = {e: [] for e in init_node2community_dict.keys()}
    for node_id1, node_id2 in combinations(init_node2community_dict.keys(), 2):
    for node_id_parent, group in init_node2community_dict.items():
    if len(init_node2community_dict[node_id1].intersection(init_node2community_dict[node_id2])) == 0 and group == init_node2community_dict[node_id1].union(init_node2community_dict[node_id2]):
    node_id_to_children[node_id_parent].append(node_id1)
    node_id_to_children[node_id_parent].append(node_id2)

    # also recording node_labels dict for the correct label for dendrogram leaves
    node_labels = dict()
    for node_id, group in init_node2community_dict.items():
    if len(group) == 1:
    node_labels[node_id] = list(group)[0]
    else:
    node_labels[node_id] = ''

    # also needing a subset to rank dict to later know within all k-length merges which came first
    subset_rank_dict = dict()
    rank = 0
    for e in communities[::-1]:
    for p in list(e):
    if tuple(p) not in subset_rank_dict:
    subset_rank_dict[tuple(sorted(p))] = rank
    rank += 1
    subset_rank_dict[tuple(sorted(chain.from_iterable(communities[-1])))] = rank

    # my function to get a merge height so that it is unique (probably not that efficient)
    def get_merge_height(sub):
    sub_tuple = tuple(sorted([node_labels[i] for i in sub]))
    n = len(sub_tuple)
    other_same_len_merges = {k: v for k, v in subset_rank_dict.items() if len(k) == n}
    min_rank, max_rank = min(other_same_len_merges.values()), max(other_same_len_merges.values())
    range = (max_rank-min_rank) if max_rank > min_rank else 1
    return float(len(sub)) + 0.8 * (subset_rank_dict[sub_tuple] - min_rank) / range

    # finally using @mdml's magic, slightly modified:
    G = nx.DiGraph(node_id_to_children)
    nodes = G.nodes()
    leaves = set( n for n in nodes if G.out_degree(n) == 0 )
    inner_nodes = [ n for n in nodes if G.out_degree(n) > 0 ]

    # Compute the size of each subtree
    subtree = dict( (n, [n]) for n in leaves )
    for u in inner_nodes:
    children = set()
    node_list = list(node_id_to_children[u])
    while len(node_list) > 0:
    v = node_list.pop(0)
    children.add( v )
    node_list += node_id_to_children[v]
    subtree[u] = sorted(children & leaves)

    inner_nodes.sort(key=lambda n: len(subtree[n])) # <-- order inner nodes ascending by subtree size, root is last

    # Construct the linkage matrix
    leaves = sorted(leaves)
    index = dict( (tuple([n]), i) for i, n in enumerate(leaves) )
    Z = []
    k = len(leaves)
    for i, n in enumerate(inner_nodes):
    children = node_id_to_children[n]
    x = children[0]
    for y in children[1:]:
    z = tuple(sorted(subtree[x] + subtree[y]))
    i, j = index[tuple(sorted(subtree[x]))], index[tuple(sorted(subtree[y]))]
    Z.append([i, j, get_merge_height(subtree[n]), len(z)]) # <-- float is required by the dendrogram function
    index[z] = k
    subtree[z] = list(z)
    x = z
    k += 1

    # dendrogram
    plt.figure()
    dendrogram(Z, labels=[node_labels[node_id] for node_id in leaves])
    plt.savefig('dendrogram.png')

    enter image description here

    关于python - 绘制 NetworkX Girvan-Newman 算法找到的社区的树状图,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/59821151/

    25 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com