python - 获取图中的连通分量-6ren

python - 获取图中的连通分量

转载作者：太空宇宙更新时间：2023-11-03 15:57:39

我正在尝试获取图表中的所有连接组件并将其打印出来。我将遍历图表的每个节点，并从该节点开始执行深度优先搜索(DFS)。这是我的代码:

graph = {
'a': ['b'],
'b': ['c'],
'c': ['d'],
'd': [],
'e': ['f'],
'f': []
}

def DFS(graph, start_node, stack = [], visited = []):
    stack.append(start_node)

    while stack:
        v = stack.pop()
        if v not in visited:
            visited.append(v)
            for neighbor in graph[v]:
                stack.append(neighbor)
    return visited



def dfs_get_connected_components_util(graph):
    visited = []

    for node in graph:
        if node not in visited:
            DFS_algo = DFS(graph, node)
            print(DFS_algo)
            visited = DFS_algo

print(dfs_get_connected_components_util(graph))

根据我的图表，有两个连接的组件，a -> b -> c -> d和 e -> f

相反，我得到以下打印结果:

['c', 'd']
['c', 'd', 'a', 'b']
['c', 'd', 'a', 'b', 'f']
['c', 'd', 'a', 'b', 'f', 'e']

我似乎无法弄清楚我在连接组件功能中做错了什么。我想这可能更像是一个Python问题。

最佳答案

这就是我想出的办法。我添加了一些内联评论来解释我所做的事情。为了清晰起见，有些内容被移至全局。我通常不建议使用全局变量。

关键是要理解递归，并且还要记住，在对对象(不是文字)进行赋值时，您只分配引用而不是它的副本。

请注意，此解决方案假设该图是无向的。请参阅下面的注释部分了解更多详细原因。

请随时要求澄清。

from collections import defaultdict

graph = {
    'a': ['b'],
    'b': ['c'],
    'c': ['d'],
    'd': [],
    'e': ['f'],
    'f': []
}

connected_components = defaultdict(set)


def dfs(node):
    """
    The key is understanding the recursion
    The recursive assumption is:
        After calling `dfs(node)`, the `connected_components` dict contains all the connected as keys,
        and the values are *the same set* that contains all the connected nodes.
    """
    global connected_components, graph
    if node not in connected_components:
        # this is important, so neighbors won't try to traverse current node
        connected_components[node] = set()
        for next_ in graph[node]:
            dfs(next_)
            # according the recursive assumption, connected_component of `next_` is also the one of `node`
            connected_components[node] = connected_components[next_]

        # all that's left is add the current node
        connected_components[node].add(node)

for node_ in graph:
    dfs(node_)


# get all connected components and convert to tuples, so they are hashable
connected_comp_as_tuples = map(tuple, connected_components.values())

# use ``set`` to make the list of connected components distinct (without repetition)
unique_components = set(connected_comp_as_tuples)
print(unique_components)

注释

当然，这还没有经过彻底测试...您应该尝试使用不同的图表(使用循环、单节点组件等)
代码可能还可以改进(在性能和清晰度方面)。例如，我们为每个节点创建一个集合，即使我们并不真正需要一个(当节点有邻居时，该集合是多余的并且将被覆盖)。
在OP的原始代码中，他使用了mutable default arguments 。这是一个很大的“不”(除非您真的知道自己在做什么)，并且如上所述，可能导致了问题。但这次不是……
考虑到 @kenny-ostroms 对这个问题的评论，关于定义的一个词(与 Python 无关):连通分量仅与无向图相关。对于有向图，该术语是强连通分量。概念是相同的 - 对于此类组件中的每 2 个节点(有向或无向)，这 2 个节点之间都有一条路径。因此，即使节点“b”可以从“a”到达，如果“a”不能从“b”到达(这种情况仅可能发生在有向图中)，“a”和“b”也不会共享连接的组件。我的解决方案对于有向图无效。该解决方案假设该图可以被视为无向(换句话说，如果“b”是“a”的邻居，我们假设“a”是“b”的邻居)。