
How can I construct a NumPy covariance matrix from a database table (loaded using pandas)?

Reposted. Author: bug小助手. Updated: 2023-10-25 14:03:58



I have a pandas table that I read from a database. It contains a covariance matrix, stored as one row per pair of labels (the numbers are random, so the matrix is not positive semi-definite). I would like a fast way to construct a NumPy matrix from the pandas table.



The pandas table I have:

index1  index2  var
apple   apple    1
apple   orange   1
orange  orange   0.5
lemon   lemon    1.2
orange  lemon   -0.5
apple   lemon   -0.8


Expected result
[[1.2, -0.5, -0.8], [-0.5, 0.5, 1.0], [-0.8, 1.0, 1.0]]



Below is the sample code I tried, but it's not very fast.



import numpy as np
import pandas as pd

pd_cov = pd.DataFrame([['apple', 'apple', 1], ['apple', 'orange', 1], ['orange', 'orange', 0.5], ['lemon', 'lemon', 1.2], ['orange', 'lemon', -0.5], ['apple', 'lemon', -0.8]], columns=['index1', 'index2', 'var'])

def cov_obt(x, y):
    # Each pair is stored only once, so try (x, y) first and fall back to (y, x).
    try:
        return float(pd_cov_ind.loc[(x, y), 'var'])
    except KeyError:
        return float(pd_cov_ind.loc[(y, x), 'var'])

ind = list(set(pd_cov['index1']))  # note: a set has no guaranteed order
pd_cov_ind = pd_cov.set_index(['index1', 'index2'])

# One Python-level lookup per matrix cell, which is why this is slow.
np.array([[cov_obt(x, y) for y in ind] for x in ind])

Recommended answer

Here's one approach:



import pandas as pd
import numpy as np

m = pd_cov.pivot_table(index='index1', columns='index2',
                       sort=False, fill_value=0).to_numpy()

m = m + m.T - np.tril(m)

m

array([[ 1. ,  1. , -0.8],
       [ 1. ,  0.5, -0.5],
       [-0.8, -0.5,  1.2]])

Explanation




  • Use df.pivot_table to pivot the data, with sort=False (to keep the labels in their original order) and fill_value=0 (needed for the next step). Chain to_numpy and assign the result to m.

  • The matrix m now has its upper triangle and diagonal filled as expected, while everything below the diagonal is still zero. Adding m and its transpose m.T copies the upper-triangle values into the lower triangle, but it also doubles the diagonal. The final step therefore subtracts np.tril(m), which here contains only the diagonal because the rest of the lower triangle is zero.
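
The mirroring step can be checked by hand on the sample data. The sketch below first replays it on the pivoted matrix written out explicitly (row and column order: apple, orange, lemon). It then shows a variant that is not part of the answer above: it writes each value at both (i, j) and (j, i) using integer positions, so it does not depend on every pair landing in the upper triangle. The names labels, position, rows, cols and values are hypothetical, and the explicit label order is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

pd_cov = pd.DataFrame(
    [["apple", "apple", 1], ["apple", "orange", 1], ["orange", "orange", 0.5],
     ["lemon", "lemon", 1.2], ["orange", "lemon", -0.5], ["apple", "lemon", -0.8]],
    columns=["index1", "index2", "var"],
)

# Part 1: the mirroring step on the pivoted matrix, written out by hand
# (row/column order: apple, orange, lemon).
upper = np.array([[1.0, 1.0, -0.8],
                  [0.0, 0.5, -0.5],
                  [0.0, 0.0,  1.2]])
doubled = upper + upper.T      # off-diagonal values mirrored, diagonal counted twice
diagonal = np.tril(upper)      # only the diagonal here: the rest of the lower triangle is zero
mirrored = doubled - diagonal  # symmetric matrix with the original diagonal

# Part 2 (an assumption of this sketch, not the answer's method):
# map labels to integer positions and write every pair in both orientations.
labels = ["apple", "orange", "lemon"]  # hypothetical, explicitly chosen order
position = {name: k for k, name in enumerate(labels)}
rows = pd_cov["index1"].map(position).to_numpy()
cols = pd_cov["index2"].map(position).to_numpy()
values = pd_cov["var"].to_numpy(dtype=float)

cov = np.zeros((len(labels), len(labels)))
cov[rows, cols] = values
cov[cols, rows] = values       # same values, transposed positions

print(mirrored)
print(cov)
```

Both parts give the same symmetric matrix for this data. In the second part, changing labels to ["lemon", "orange", "apple"] yields the matrix in the order shown under "Expected result" in the question.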

