python - Fast Pathfinder associative network algorithm (PFNET) in Python

Reposted by: 行者123 · Updated: 2023-12-05 04:40:31

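I've been trying to implement the "Fast Pathfinder" network pruning algorithm from https://doi.org/10.1016/j.ipm.2007.09.005 in Python/networkX, and I have finally stumbled upon something that returns results that look more or less right.

However, I'm not competent enough to test whether the results are consistently (or ever) correct. I have doubts about directed graphs in particular, and I'm not sure whether the original was even intended to work on directed graphs. I haven't found any Python implementation of Pathfinder network algorithms, but if there is an existing alternative, I'd also be interested in comparing results. I know there is an implementation in R (https://rdrr.io/cran/comato/src/R/pathfinder.r), from which I also took some inspiration.

Based on my best (read: poor) understanding, the algorithm described in the paper uses a distance matrix of shortest paths generated by the Floyd-Warshall algorithm, compares those distances to the weighted adjacency matrix, and picks only the matches as links. The intuition for the expected result in the undirected case is the union of all edges in all possible minimum spanning trees.

This is what I have tried to emulate with the following function: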

def minimal_pathfinder(G, r = float("inf")):
    """ 
    Args:
    -----
    G [networkX graph]:
        Graph to filter links from.
    r [float]:
        "r" parameter as in the paper.

    Returns:
    -----
    PFNET [networkX graph]:
        Graph containing only the PFNET links.
    """
    
    import networkx as nx
    from collections import defaultdict
    
    H = G.copy()
    
    # Initialize adjacency matrix W
    W = defaultdict(lambda: defaultdict(lambda: float("inf")))
    
    # Set diagonal to 0
    for u in H.nodes():
        W[u][u] = 0 
    
    # Get weights and set W values
    for i, j, d in H.edges(data=True):
        W[i][j] = d['weight'] # Add weights to W
        
    # Get shortest path distance matrix D
    dist = nx.floyd_warshall_predecessor_and_distance(H, weight='weight')[1]
    
    # Iterate over all triples to get values for D
    for k in H.nodes():
        for i in H.nodes():
            for j in H.nodes():
                if r == float("inf"): # adapted from the R-comato version which does a similar check
                # Discard non-shortest paths
                    dist[i][j] = min(dist[i][j], (dist[i][k] + dist[k][j]))
                else:
                    dist[i][j] = min(dist[i][j], (((dist[i][k]) ** r) + ((dist[k][j]) ** r )) ** (1/r))
                
    # Check for type; set placeholder for either case
    if not H.is_directed():
        PFNET = nx.Graph()
        PFNET.add_nodes_from(H.nodes(data=True))
    else:
        PFNET = nx.DiGraph()
        PFNET.add_nodes_from(H.nodes(data=True))
        
    # Add links D_ij only if == W_ij
    for i in H.nodes():
        for j in H.nodes():
            if dist[i][j] == W[i][j]: # If shortest path distance equals distance in adjacency
                if dist[i][j] == float("inf"): # Skip infinite path lengths
                    pass
                elif i == j: # Skip the diagonal
                    pass
                else: # Add link to PFNET
                    weight = dist[i][j]
                    PFNET.add_edge(i, j, weight=weight)
                    
    return PFNET

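I've tested this with a bunch of real networks (directed and undirected) and randomly generated ones, both ranging from 20 to around 300 nodes, with up to a few thousand edges (e.g. complete graphs, connected caveman graphs). In all cases it returns something, but I have little confidence that the results are correct. Since I haven't found any other implementation, I'm not sure how to verify that it works consistently (I don't really use any other languages at all).

I'm fairly sure there is still something wrong with it, but I'm not sure what it might be.

Simple use case: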

import networkx as nx
import numpy as np

G = nx.complete_graph(50) # Generate a complete graph

# Add random weights
for (u,v,w) in G.edges(data=True):
    w['weight'] = np.random.randint(1,20)
    
PFNET = minimal_pathfinder(G)

# nx.info() was removed in networkx 3.0; printing the graph gives the same summary
print(G)
print(PFNET)

Output:

Graph with 50 nodes and 1225 edges
Graph with 50 nodes and 236 edges

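I'd like to know two things:

1. Any idea what might be wrong with the implementation? Should I be confident in the results?

2. Any idea how to convert this to work with similarity data instead of distances?

For the second one, I considered normalizing the weights to the 0-1 range and converting all distances to similarities as 1 - distance. But I'm not sure whether this is theoretically valid, and would appreciate a second opinion.

EDIT: I may have found a solution to Q2 in the original paper: change float("inf") to float("-inf") and change min to max in the first loop. From the authors' footnote: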

Actually, using similarities or distances has no influence at all in our proposal. In case of using similarities, we would only need to replace MIN by MAX, ’>’ by ’<’, and use r = -inf to mimic the MIN function instead of the MAX function in the Fast Pathfinder algorithm.
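The r parameter in the footnote can be read as the exponent used to combine two link weights along a path. Below is a minimal sketch of that combination, assuming positive weights; `combine` is a hypothetical helper name chosen for illustration, not something taken from the paper or from networkx.

```python
import math

def combine(a, b, r):
    """Weight of a two-link path with link weights a and b under exponent r."""
    if r == math.inf:
        return max(a, b)   # limit of the formula below as r -> +inf
    if r == -math.inf:
        return min(a, b)   # limit of the formula below as r -> -inf
    return (a ** r + b ** r) ** (1 / r)

print(combine(3, 4, 1))          # 7.0, ordinary sum
print(combine(3, 4, 2))          # 5.0, Euclidean-style
print(combine(3, 4, math.inf))   # 4, the largest link
print(combine(3, 4, -math.inf))  # 3, the smallest link
```

With r = -inf a path is only as strong as its weakest link, which seems to be why the footnote pairs it with MAX over paths when the weights are similarities.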

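Any input is much appreciated, thanks!

Edited per comments, using the "Example from a data file" section (erroneous example added from here):

Adjacency in the starting graph: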

matrix([[0, 1, 4, 2, 2],
        [1, 0, 2, 3, 0],
        [4, 2, 0, 3, 1],
        [2, 3, 3, 0, 3],
        [2, 0, 1, 3, 0]], dtype=int32)

Then, after pruning with the function (having first converted it to a networkX undirected graph):

matrix([[0, 1, 0, 2, 2],
        [1, 0, 2, 3, 0],
        [0, 2, 0, 3, 1],
        [2, 3, 3, 0, 3],
        [2, 0, 1, 3, 0]], dtype=int32)

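It seems to drop only the highest-weight edge and keep all the others. Since the expected result in the linked example is given as an edge list, here is the edge list of the result I obtained as well: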

source  target  weight
1       2       1
1       4       2
1       5       2
2       3       2
2       4       3 
3       4       3
3       5       1
4       5       3
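For comparison, here is a minimal sketch that prunes the same matrix twice with a plain Floyd-Warshall-style loop: once measuring path length as the sum of weights, and once as the maximum weight (the reading of r = infinity used in the answer's code below). This is not the paper's Fast Pathfinder procedure; `prune` is a hypothetical helper, it treats 0 off the diagonal as "no edge", and it uses 0-based node labels.

```python
import math

def prune(weights, combine):
    """Keep edge (i, j) only if no path is shorter under `combine`."""
    n = len(weights)
    # Off-diagonal zeros mean "no edge" and become infinite distances.
    d = [[0 if i == j else (weights[i][j] or math.inf) for j in range(n)]
         for i in range(n)]
    best = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                best[i][j] = min(best[i][j], combine(best[i][k], best[k][j]))
    return [(i, j, d[i][j]) for i in range(n) for j in range(i + 1, n)
            if d[i][j] < math.inf and d[i][j] == best[i][j]]

D = [[0, 1, 4, 2, 2],
     [1, 0, 2, 3, 0],
     [4, 2, 0, 3, 1],
     [2, 3, 3, 0, 3],
     [2, 0, 1, 3, 0]]

by_sum = prune(D, lambda a, b: a + b)  # path length = sum of weights
by_max = prune(D, max)                 # path length = largest weight

print(len(by_sum))  # 8
print(by_max)       # [(0, 1, 1), (0, 3, 2), (0, 4, 2), (1, 2, 2), (2, 4, 1)]
```

The sum-based variant reproduces the eight edges listed above, while the max-based variant keeps only five, so the choice of how path length is combined appears to matter a great deal here.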

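Best answer

Here is a possible implementation of Fast-Pathfinder in Python using the networkx library. Notes:

The implementation corresponds to the paper.
It is inspired by the C implementation on GitHub.
Only the maximum variant is implemented, where the input matrix is a similarity matrix rather than a distance matrix (the edges with the highest values are kept).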

import networkx as nx
import numpy as np

def fast_pfnet(G, q, r):
    
    s = G.number_of_nodes()
    weights_init = np.zeros((s,s))
    weights = np.zeros((s,s))
    hops = np.zeros((s,s))
    pfnet = np.zeros((s,s))

    for i, j, d in G.edges(data=True):
        weights_init[i,j] = d['weight']
        weights_init[j,i] = d['weight']

    for i in range(s):
        for j in range(s):
            weights[i,j] = -weights_init[i,j]
            if i==j:
                hops[i,j] = 0
            else:
                hops[i,j] = 1

    def update_weight_maximum(i, j, k, wik, wkj, weights, hops, p):
        if p<=q:
            if r==0:
                # r == infinity
                dist = max(wik, wkj)
            else:
                dist = (wik**r + wkj**r) ** (1/r)

            if dist < weights[i,j]:
                weights[i,j] = dist
                weights[j,i] = dist
                hops[i,j] = p
                hops[j,i] = p
                
    def is_equal(a, b):
        return abs(a-b)<0.00001

    for k in range(s):
        for i in range(s):
            if i!=k:
                beg = i+1
                for j in range(beg, s):
                    if j!=k:
                        update_weight_maximum(i, j, k, weights_init[i,k], weights_init[k,j], weights, hops, 2)
                        update_weight_maximum(i, j, k, weights[i,k], weights[k,j], weights, hops, hops[i,k]+hops[k,j])

    for i in range(s):
        for j in range(s): # Possible optimisation: in case of symmetrical matrices, we do not need to go from 0 to s but from i+1 to s
            if not is_equal(weights_init[i,j], 0):
                if is_equal(weights[i,j], -weights_init[i,j]):
                    pfnet[i,j] = weights_init[i,j]
                else:
                    pfnet[i,j] = 0

    # from_numpy_matrix was removed in networkx 3.0; from_numpy_array takes a plain ndarray
    return nx.from_numpy_array(pfnet)

Usage:

m = np.array([[0, 1, 4, 2, 2],
        [1, 0, 2, 3, 0],
        [4, 2, 0, 3, 1],
        [2, 3, 3, 0, 3],
        [2, 0, 1, 3, 0]], dtype=np.int32)

G = nx.from_numpy_array(m)

# Fast-PFNET parameters set to emulate MST-PFNET
# This variant is OK for other parameters (q, r) but for the ones below
# it is faster to implement the MST-PFNET variant instead.
q = G.number_of_nodes()-1
r = 0

P = fast_pfnet(G, q, r)

list(P.edges(data=True))

This should return:

[(0, 2, {'weight': 4.0}),
 (1, 3, {'weight': 3.0}),
 (2, 3, {'weight': 3.0}),
 (3, 4, {'weight': 3.0})]

This is similar to what is shown on the website (see the example in the "After the application of Pathfinder" section).
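Since the parameters above are chosen to emulate MST-PFNET, one independent sanity check is to build a maximum spanning tree of the same similarity matrix and compare edge lists. The following is a minimal pure-Python Kruskal sketch; `maximum_spanning_tree` is a hypothetical helper written for this check. When weights are tied a single spanning tree may contain only a subset of the PFNET edges, so exact agreement should only be expected when, as in this example, the tree is unique.

```python
def maximum_spanning_tree(n, edges):
    """Kruskal's algorithm, taking the heaviest edges first."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the component root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two different components
            parent[ru] = rv
            tree.append((u, v, w))
    return sorted(tree)

sim = [[0, 1, 4, 2, 2],
       [1, 0, 2, 3, 0],
       [4, 2, 0, 3, 1],
       [2, 3, 3, 0, 3],
       [2, 0, 1, 3, 0]]

# Upper triangle only; 0 is treated as "no edge".
edges = [(i, j, sim[i][j]) for i in range(5) for j in range(i + 1, 5) if sim[i][j]]
mst = maximum_spanning_tree(5, edges)
print(mst)  # [(0, 2, 4), (1, 3, 3), (2, 3, 3), (3, 4, 3)]
```

The four edges agree with the output of fast_pfnet shown above.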

Regarding python - Fast Pathfinder associative network algorithm (PFNET) in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/70262806/
