python - Treat nan as zero in numpy array summation except for nan in all arrays

Reposted. Author: 太空狗. Updated: 2023-10-30 00:43:11

I have two numpy arrays, NS and EW, that I want to add together. Each of them has missing values at different positions, for example

NS = array([[  1.,   2.,  nan],
            [  4.,   5.,  nan],
            [  6.,  nan,  nan]])
EW = array([[  1.,   2.,  nan],
            [  4.,  nan,  nan],
            [  6.,  nan,   9.]])

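How can I perform the summation in a numpy way so that, if only one array has nan at a position, that nan is treated as zero, and if both arrays have nan at the same position, the nan is kept?

The result I expect to see is

SUM = array([[  2.,   4.,  nan],
             [  8.,   5.,  nan],
             [ 12.,  nan,   9.]])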

When I try

SUM = np.add(NS, EW)

it gives me

SUM=array([[  2.,   4.,  nan],
           [  8.,  nan,  nan],
           [ 12.,  nan,  nan]])

When I try

SUM = np.nansum(np.dstack((NS, EW)), 2)

it gives me

SUM=array([[  2.,   4.,   0.],
           [  8.,   5.,   0.],
           [ 12.,   0.,   9.]])

Of course, I can achieve what I want with element-wise operations,

SUM = np.empty_like(NS)  # preallocate the output array
for i in range(np.size(NS, 0)):
    for j in range(np.size(NS, 1)):
        if np.isnan(NS[i, j]) and np.isnan(EW[i, j]):
            SUM[i, j] = np.nan
        elif np.isnan(NS[i, j]):
            SUM[i, j] = EW[i, j]
        elif np.isnan(EW[i, j]):
            SUM[i, j] = NS[i, j]
        else:
            SUM[i, j] = NS[i, j] + EW[i, j]

but it is very slow. So I am looking for a more idiomatic NumPy solution to this problem.

Thanks in advance for your help!

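Best answer

Approach #1: one approach using np.where:

import numpy as np

def sum_nan_arrays(a, b):
    ma = np.isnan(a)  # True where a is missing
    mb = np.isnan(b)  # True where b is missing
    # NaN only where both are missing; elsewhere add with NaN replaced by 0
    return np.where(ma & mb, np.nan, np.where(ma, 0, a) + np.where(mb, 0, b))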

Sample run:

In [43]: NS
Out[43]:
array([[ 1., 2., nan],
[ 4., 5., nan],
[ 6., nan, nan]])

In [44]: EW
Out[44]:
array([[ 1., 2., nan],
[ 4., nan, nan],
[ 6., nan, 9.]])

In [45]: sum_nan_arrays(NS, EW)
Out[45]:
array([[ 2., 4., nan],
[ 8., 5., nan],
[ 12., nan, 9.]])
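
To make the nested np.where call easier to follow, the sketch below splits it into its intermediate steps on the arrays from the question. The variable names both_missing and zero_filled_sum are introduced here for illustration only and are not part of the original answer.

```python
import numpy as np

nan = np.nan
NS = np.array([[1., 2., nan],
               [4., 5., nan],
               [6., nan, nan]])
EW = np.array([[1., 2., nan],
               [4., nan, nan],
               [6., nan, 9.]])

ma = np.isnan(NS)        # True where NS is missing
mb = np.isnan(EW)        # True where EW is missing
both_missing = ma & mb   # positions that must stay NaN in the result

# Replace NaN with 0 in each input, then add as usual.
zero_filled_sum = np.where(ma, 0, NS) + np.where(mb, 0, EW)

# Put NaN back only where both inputs were missing.
result = np.where(both_missing, nan, zero_filled_sum)

print(both_missing)
print(zero_filled_sum)
print(result)
```

The intermediate zero_filled_sum is the same array that the np.nansum attempt in the question produced; the final np.where is what restores nan at the positions where both inputs were missing.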

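Approach #2: a possibly faster one that mixes in boolean indexing:

def sum_nan_arrays_v2(a, b):
    ma = np.isnan(a)
    mb = np.isnan(b)
    m_keep_a = ~ma & mb   # only b is NaN: keep a's value
    m_keep_b = ma & ~mb   # only a is NaN: keep b's value
    out = a + b           # NaN wherever at least one input is NaN
    out[m_keep_a] = a[m_keep_a]
    out[m_keep_b] = b[m_keep_b]
    return out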
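
The answer gives no sample run for the second approach, so here is a minimal self-contained check on the arrays from the question. The function is repeated so the block runs on its own, and the expected array is the result the question asks for. It also checks that the inputs are left unmodified, which holds because a + b allocates a new array.

```python
import numpy as np

def sum_nan_arrays_v2(a, b):
    ma = np.isnan(a)
    mb = np.isnan(b)
    m_keep_a = ~ma & mb   # only b is NaN: keep a's value
    m_keep_b = ma & ~mb   # only a is NaN: keep b's value
    out = a + b           # new array; NaN wherever at least one input is NaN
    out[m_keep_a] = a[m_keep_a]
    out[m_keep_b] = b[m_keep_b]
    return out

nan = np.nan
NS = np.array([[1., 2., nan],
               [4., 5., nan],
               [6., nan, nan]])
EW = np.array([[1., 2., nan],
               [4., nan, nan],
               [6., nan, 9.]])
expected = np.array([[2., 4., nan],
                     [8., 5., nan],
                     [12., nan, 9.]])

NS_before, EW_before = NS.copy(), EW.copy()
result = sum_nan_arrays_v2(NS, EW)

# assert_array_equal treats NaNs at matching positions as equal.
np.testing.assert_array_equal(result, expected)
np.testing.assert_array_equal(NS, NS_before)   # inputs are not modified
np.testing.assert_array_equal(EW, EW_before)
print(result)
```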

Runtime test:

In [140]: # Setup input arrays with 4/9 ratio of NaNs (same as in the question)
...: a = np.random.rand(3000,3000)
...: b = np.random.rand(3000,3000)
...: a.ravel()[np.random.choice(range(a.size), size=4000000, replace=0)] = np.nan
...: b.ravel()[np.random.choice(range(b.size), size=4000000, replace=0)] = np.nan
...:

In [141]: np.nanmax(np.abs(sum_nan_arrays(a, b) - sum_nan_arrays_v2(a, b))) # Verify
Out[141]: 0.0

In [142]: %timeit sum_nan_arrays(a, b)
10 loops, best of 3: 141 ms per loop

In [143]: %timeit sum_nan_arrays_v2(a, b)
10 loops, best of 3: 177 ms per loop

In [144]: # Setup input arrays with fewer NaNs
...: a = np.random.rand(3000,3000)
...: b = np.random.rand(3000,3000)
...: a.ravel()[np.random.choice(range(a.size), size=4000, replace=0)] = np.nan
...: b.ravel()[np.random.choice(range(b.size), size=4000, replace=0)] = np.nan
...:

In [145]: np.nanmax(np.abs(sum_nan_arrays(a, b) - sum_nan_arrays_v2(a, b))) # Verify
Out[145]: 0.0

In [146]: %timeit sum_nan_arrays(a, b)
10 loops, best of 3: 69.6 ms per loop

In [147]: %timeit sum_nan_arrays_v2(a, b)
10 loops, best of 3: 38 ms per loop
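
The timings above come from an IPython session on the answerer's machine. To repeat the comparison as a plain script, something like the following sketch could be used. It assumes smaller 1000 x 1000 inputs so that it finishes quickly, uses the standard timeit module instead of %timeit, and introduces a hypothetical helper make_input; absolute numbers will differ by machine and NumPy version, so no expected timings are shown.

```python
import timeit
import numpy as np

def sum_nan_arrays(a, b):
    ma = np.isnan(a)
    mb = np.isnan(b)
    return np.where(ma & mb, np.nan, np.where(ma, 0, a) + np.where(mb, 0, b))

def sum_nan_arrays_v2(a, b):
    ma = np.isnan(a)
    mb = np.isnan(b)
    m_keep_a = ~ma & mb
    m_keep_b = ma & ~mb
    out = a + b
    out[m_keep_a] = a[m_keep_a]
    out[m_keep_b] = b[m_keep_b]
    return out

def make_input(rng, shape, n_nan):
    """Hypothetical helper: random array with n_nan entries set to NaN."""
    arr = rng.random(shape)
    idx = rng.choice(arr.size, size=n_nan, replace=False)
    arr.ravel()[idx] = np.nan   # ravel() is a view here (contiguous array)
    return arr

rng = np.random.default_rng(0)
shape = (1000, 1000)
timings = {}

# Roughly 4/9 NaNs (as in the question) versus very few NaNs.
for label, n_nan in [("many NaNs", 444_444), ("few NaNs", 444)]:
    a = make_input(rng, shape, n_nan)
    b = make_input(rng, shape, n_nan)
    # Both approaches should agree exactly, NaN positions included.
    np.testing.assert_array_equal(sum_nan_arrays(a, b), sum_nan_arrays_v2(a, b))
    for func in (sum_nan_arrays, sum_nan_arrays_v2):
        best = min(timeit.repeat(lambda: func(a, b), number=5, repeat=3)) / 5
        timings[(label, func.__name__)] = best
        print(f"{label:>10}  {func.__name__:<18} {best * 1e3:.2f} ms per call")
```

In the original measurements, the np.where version was faster when NaNs were frequent (141 ms against 177 ms) and the boolean-indexing version was faster when NaNs were rare (38 ms against 69.6 ms), so it is worth timing both on data that resembles your own.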

Regarding "python - Treat nan as zero in numpy array summation except for nan in all arrays", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42209838/
