
python - Bivariate structured interpolation of large array with NaN values or mask

Reposted. Author: 太空狗. Updated: 2023-10-29 21:59:54

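I am trying to interpolate regularly gridded windstress data using Scipy's RectBivariateSpline class. At some grid points, the input data contains invalid entries, which are set to NaN. To start with, I used the solution to Scott's question on two-dimensional interpolation. With my data, the interpolation returns an array containing only NaNs. I also tried a different approach, treating my data as unstructured and using the SmoothBivariateSpline class. Apparently I have too many data points for unstructured interpolation, since the shape of the data array is (719 x 2880).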

To illustrate my problem, I created the following script:

from __future__ import division
import numpy
import pylab

from scipy import interpolate


# The signal and lots of noise
M, N = 20, 30 # The shape of the data array
y, x = numpy.mgrid[0:M+1, 0:N+1]
signal = -10 * numpy.cos(x / 50 + y / 10) / (y + 1)
noise = numpy.random.normal(size=(M+1, N+1))
z = signal + noise


# Some holes in my dataset
z[1:2, 0:2] = numpy.nan
z[1:2, 9:11] = numpy.nan
z[0:1, :12] = numpy.nan
z[10:12, 17:19] = numpy.nan


# Interpolation!
Y, X = numpy.mgrid[0.125:M:0.5, 0.125:N:0.5]
sp = interpolate.RectBivariateSpline(y[:, 0], x[0, :], z)
Z = sp(Y[:, 0], X[0, :])

sel = ~numpy.isnan(z)
esp = interpolate.SmoothBivariateSpline(y[sel], x[sel], z[sel], 0*z[sel]+5)
eZ = esp(Y[:, 0], X[0, :])


# Comparing the results
pylab.close('all')
pylab.ion()

bbox = dict(edgecolor='w', facecolor='w', alpha=0.9)
crange = numpy.arange(-15., 16., 1.)

fig = pylab.figure()
ax = fig.add_subplot(1, 3, 1)
ax.contourf(x, y, z, crange)
ax.set_title('Original')
ax.text(0.05, 0.98, 'a)', ha='left', va='top', transform=ax.transAxes,
        bbox=bbox)

bx = fig.add_subplot(1, 3, 2, sharex=ax, sharey=ax)
bx.contourf(X, Y, Z, crange)
bx.set_title('Spline')
bx.text(0.05, 0.98, 'b)', ha='left', va='top', transform=bx.transAxes,
        bbox=bbox)

cx = fig.add_subplot(1, 3, 3, sharex=ax, sharey=ax)
cx.contourf(X, Y, eZ, crange)
cx.set_title('Expected')
cx.text(0.05, 0.98, 'c)', ha='left', va='top', transform=cx.transAxes,
        bbox=bbox)

This gives the following result:

[Figure: Bivariate gridding. (a) The original constructed data, (b) Scipy's RectBivariateSpline and (c) SmoothBivariateSpline classes.]

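The figure shows the constructed data (a) and the results of using Scipy's RectBivariateSpline (b) and SmoothBivariateSpline (c) classes. The first interpolation yields an array containing only NaNs. Ideally, I would expect a result similar to the second interpolation, which is more computationally intensive. I do not necessarily need the data extrapolated outside the domain.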

Best answer

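You can use griddata. The only problem is that it does not handle the edges well. Depending on what your data looks like, this can be helped by, for example, reflecting the data across the edges. Here is an example: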

from __future__ import division
import numpy
import pylab
from scipy import interpolate


# The signal and lots of noise
M, N = 20, 30 # The shape of the data array
y, x = numpy.mgrid[0:M+1, 0:N+1]
signal = -10 * numpy.cos(x / 50 + y / 10) / (y + 1)
noise = numpy.random.normal(size=(M+1, N+1))
z = signal + noise


# Some holes in my dataset
z[1:2, 0:2] = numpy.nan
z[1:2, 9:11] = numpy.nan
z[0:1, :12] = numpy.nan
z[10:12, 17:19] = numpy.nan

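# Mirror the data across the row-0 and column-0 edges, so that those edges
# become interior points of a grid twice as large in each direction.
zi = numpy.vstack((z[::-1,:],z))
zi = numpy.hstack((zi[:,::-1], zi))
y, x = numpy.mgrid[0:2*(M+1), 0:2*(N+1)]
y *= 5 # anisotropic interpolation if needed.

# Interpolate from the valid points onto the full mirrored grid.
valid = ~numpy.isnan(zi)
zi = interpolate.griddata((y[valid], x[valid]),
                          zi[valid], (y, x), method='cubic')
# Keep one quadrant and flip it back to the original orientation.
zi = zi[:(M+1),:(N+1)][::-1,::-1]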

pylab.subplot(1,2,1)
pylab.imshow(z, origin='lower')
pylab.subplot(1,2,2)
pylab.imshow(zi, origin='lower')
pylab.show()
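The reflection idea can also be sketched with `numpy.pad`, which mirrors the array past all four edges instead of two. This is only a minimal sketch, not the answer's code: the helper name `fill_nan_reflect` and its `pad` parameter are made up for illustration, and it assumes `1 <= pad` and that `pad` is smaller than both array dimensions. Holes at an edge that are wider than `pad` may still come back as NaN.

```python
import numpy as np
from scipy import interpolate


def fill_nan_reflect(z, pad=3, method='cubic'):
    """Fill NaN holes in a 2-D array after mirroring it past every edge.

    Hypothetical helper for illustration; assumes 1 <= pad < min(z.shape).
    """
    z = np.asarray(z, dtype=float)
    # Mirror `pad` rows/columns outward so edge holes get neighbours on both sides.
    zp = np.pad(z, pad, mode='reflect')
    yp, xp = np.mgrid[0:zp.shape[0], 0:zp.shape[1]]
    valid = ~np.isnan(zp)
    # Interpolate from the valid points onto the whole padded grid.
    filled = interpolate.griddata((yp[valid], xp[valid]), zp[valid],
                                  (yp, xp), method=method)
    # Crop the padding away again.
    return filled[pad:-pad, pad:-pad]
```

Calling `fill_nan_reflect(z)` on the `z` array built above should return an array of the same shape with the holes filled. The values at points that were already valid stay as they were, up to floating-point error, because `griddata` interpolates through its data points.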

If you run out of memory, you can split the data along these lines:

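import numpy as np
from scipy import interpolate


def large_griddata(data_x, vals, grid, method='nearest'):
    x, y = data_x
    X, Y = grid
    # Try the whole problem first; split only if memory runs out.
    try:
        return interpolate.griddata((x,y),vals,(X,Y),method=method)
    except MemoryError:
        pass

    # Split at the midpoint of the target grid in each direction.
    N = (np.min(X)+np.max(X))/2.
    M = (np.min(Y)+np.max(Y))/2.

    # Quadrant membership of the data points ...
    masks = [(x<N) & (y<M),
             (x<N) & (y>=M),
             (x>=N) & (y<M),
             (x>=N) & (y>=M)]

    # ... and of the target grid points, in the same order.
    grid_mask = [(X<N) & (Y<M),
                 (X<N) & (Y>=M),
                 (X>=N) & (Y<M),
                 (X>=N) & (Y>=M)]

    # Interpolate each quadrant on its own, recursing if it is still too large.
    Z = np.zeros_like(X, dtype=float)
    for i in range(4):
        Z[grid_mask[i]] = large_griddata((x[masks[i]], y[masks[i]]),
                vals[masks[i]], (X[grid_mask[i]], Y[grid_mask[i]]), method=method)

    return Z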
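For data that already sits on a regular grid, as in the question, the same splitting idea can be applied tile by tile, with each tile borrowing a margin of neighbouring data so the seams are less visible. The sketch below is a variation on the splitting approach, not the answer's code. The name `fill_nan_tiled` and the `tile` and `halo` parameters are made up for illustration. It assumes every tile window holds enough valid, non-collinear points for `griddata` to triangulate, and it falls back to nearest-neighbour values where a hole lies outside the convex hull of the window's valid points.

```python
import numpy as np
from scipy import interpolate


def fill_nan_tiled(z, tile=64, halo=8, method='linear'):
    """Fill NaN holes tile by tile so only a small window is triangulated at once.

    Hypothetical helper for illustration; assumes each window contains
    enough valid points for griddata to work with.
    """
    z = np.asarray(z, dtype=float)
    out = z.copy()
    rows, cols = z.shape
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            r1, c1 = min(r0 + tile, rows), min(c0 + tile, cols)
            holes = np.isnan(z[r0:r1, c0:c1])
            if not holes.any():
                continue  # nothing to fill in this tile
            # Window = tile plus a halo of neighbouring data, clipped to the array.
            wr0, wc0 = max(r0 - halo, 0), max(c0 - halo, 0)
            wr1, wc1 = min(r1 + halo, rows), min(c1 + halo, cols)
            window = z[wr0:wr1, wc0:wc1]
            wy, wx = np.mgrid[wr0:wr1, wc0:wc1]
            valid = ~np.isnan(window)
            points = (wy[valid], wx[valid])
            ty, tx = np.mgrid[r0:r1, c0:c1]
            targets = (ty[holes], tx[holes])
            vals = interpolate.griddata(points, window[valid], targets,
                                        method=method)
            # Holes outside the convex hull come back as NaN; use nearest there.
            bad = np.isnan(vals)
            if bad.any():
                vals[bad] = interpolate.griddata(
                    points, window[valid],
                    (targets[0][bad], targets[1][bad]), method='nearest')
            out[r0:r1, c0:c1][holes] = vals
    return out
```

Only the holes are overwritten, so the valid input values pass through unchanged. A larger `halo` should give smoother results at tile boundaries, at the cost of more memory per window.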

For more on python - Bivariate structured interpolation of large array with NaN values or mask, see the similar question on Stack Overflow: https://stackoverflow.com/questions/15485343/
