
python - Finding a simpler way to cluster 2-D scatter data into grid array data

Reposted. Author: 行者123. Updated: 2023-11-28 22:39:32

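I have worked out a way to cluster scattered point data into a structured 2-D array (similar to a rasterize function). I hope there is a better way to achieve this.

My work

1. Introduction

  • 1000 point records with the attributes (longitude, latitude, emission), each representing a factory located at (x, y) that emits a certain amount of CO2 into the atmosphere
  • grid network: a predefined 2-D array with shape 20x20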

http://i4.tietuku.com/02fbaf32d2f09fff.png

The code is reproduced here:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

#### define the map area
xc1,xc2,yc1,yc2 = 113.49805889531724,115.5030664238035,37.39995194888143,38.789235929357105
# named m rather than map, so the built-in map() is not shadowed
m = Basemap(llcrnrlon=xc1,llcrnrlat=yc1,urcrnrlon=xc2,urcrnrlat=yc2)

#### read the point data and scatter-plot the points by position
df = pd.read_csv("xxxxx.csv")
px,py = m(df.lon, df.lat)
m.scatter(px, py, color = "red", s= 5,zorder =3)

#### predefine the grid network
lon_grid,lat_grid = np.linspace(xc1,xc2,21), np.linspace(yc1,yc2,21)
lon_x,lat_y = np.meshgrid(lon_grid,lat_grid)
grids = np.zeros((20,20))
plt.pcolormesh(lon_x,lat_y,grids,cmap = 'gray', facecolor = 'none',edgecolor = 'k',zorder=3)

2. My goal

  1. Find the nearest grid point for each factory
  2. Add the factory's emission data to that grid point
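
For an evenly spaced grid, the two steps above can also be done with plain index arithmetic. The following is a minimal sketch, not the asker's code: it assumes the target is the cell that contains each factory (rather than the nearest grid node), and the function name `bin_to_grid` and the sample values are hypothetical.

```python
import numpy as np

def bin_to_grid(lon, lat, values, lon_min, lon_max, lat_min, lat_max, n=20):
    """Sum `values` into an n x n array of cells covering the given extent."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    values = np.asarray(values, dtype=float)
    # Fractional position inside the extent, scaled to cell units
    ix = np.floor((lon - lon_min) / (lon_max - lon_min) * n).astype(int)
    iy = np.floor((lat - lat_min) / (lat_max - lat_min) * n).astype(int)
    # Drop points outside the extent; points on the upper edge go to the last cell
    inside = (lon >= lon_min) & (lon <= lon_max) & (lat >= lat_min) & (lat <= lat_max)
    ix = np.clip(ix[inside], 0, n - 1)
    iy = np.clip(iy[inside], 0, n - 1)
    grid = np.zeros((n, n))
    # np.add.at accumulates correctly when several points share a cell
    np.add.at(grid, (iy, ix), values[inside])
    return grid

# Hypothetical data: two points share cell (0, 0), one point lies outside the extent
demo = bin_to_grid([0.5, 0.6, 9.9, 12.0], [0.5, 0.7, 9.9, 5.0],
                   [1.0, 2.0, 4.0, 8.0], 0.0, 10.0, 0.0, 10.0, n=10)
```

The rows of the returned array follow latitude and the columns follow longitude, which is the layout `plt.pcolormesh(lon_x, lat_y, grid)` expects.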

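3. Algorithm implementation

3.1 Raster grid

Note: the 20x20 grid points are distributed over this area and are shown as blue dots.

http://i4.tietuku.com/8548554587b0cb3a.png

3.2 KD-tree

Find the nearest blue dot for each red point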

from scipy.spatial import KDTree

sh = (20*20, 2)
grids = np.zeros(sh)          # (lon, lat) of every grid point

grids_em = np.zeros(20*20)    # accumulated emission per grid point

# xx and yy are not defined in the original post; the loops below assume the
# 20 grid points per axis are the first 20 values of lon_grid and lat_grid
k = 0
for j in range(20):
    for i in range(20):
        grids[k] = np.array([lon_grid[i], lat_grid[j]])
        k += 1

T = KDTree(grids)

x_delta = lon_grid[2] - lon_grid[1]
y_delta = lat_grid[2] - lat_grid[1]
R = np.sqrt(x_delta**2 + y_delta**2)

for i in range(len(df.lon)):
    idx = T.query_ball_point([df.lon.iloc[i], df.lat.iloc[i]], r=R)
    if len(idx) == 0:
        continue  # no grid point within R of this factory
    # Sometimes more than one blue dot is found, so compute the distance
    # between the factory (red point) and every blue dot in the list
    if len(idx) > 1:
        distance = []
        for k in range(len(idx)):
            g = grids[idx[k]]  # look up the candidate grid point, not grids[k]
            distance.append(np.sqrt((df.lon.iloc[i] - g[0])**2 + (df.lat.iloc[i] - g[1])**2))
        pos_index = distance.index(min(distance))
        pos = idx[pos_index]
    # Only one point found
    else:
        pos = idx[0]
    grids_em[pos] += df.so2.iloc[i]

4. Result

co2 = grids_em.reshape(20,20)
plt.pcolormesh(lon_x,lat_y,co2,cmap =plt.cm.Spectral_r,zorder=3)

http://i4.tietuku.com/6ded65c4ac301294.png

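5. My questions

  • Can someone point out drawbacks or bugs in this approach?
  • Is there an algorithm better suited to my goal?

Thank you very much!

Best answer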

Your code contains a lot of for-loops, which is not the numpy way.

First, create some sample data:

import numpy as np
import pandas as pd
from scipy.spatial import KDTree
import pylab as pl

xc1, xc2, yc1, yc2 = 113.49805889531724, 115.5030664238035, 37.39995194888143, 38.789235929357105

N = 1000
GSIZE = 20
x, y = np.random.multivariate_normal([(xc1 + xc2)*0.5, (yc1 + yc2)*0.5], [[0.1, 0.02], [0.02, 0.1]], size=N).T
value = np.ones(N)

df_points = pd.DataFrame({"x":x, "y":y, "v":value})

For an evenly spaced grid, you can use hist2d():

pl.hist2d(df_points.x, df_points.y, weights=df_points.v, bins=20, cmap="viridis");
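
`hist2d()` draws the plot; when the summed values themselves are needed as an array, `np.histogram2d` returns them directly. The sketch below uses made-up points and assumes the map extent is passed through `range` so that the bins line up with a predefined grid:

```python
import numpy as np

# Hypothetical sample data: three points with weights
x = np.array([0.1, 0.2, 1.5])
y = np.array([0.1, 0.3, 1.9])
v = np.array([1.0, 2.0, 5.0])

# 2 x 2 bins over the extent [0, 2] x [0, 2]; weights are summed per bin
H, xedges, yedges = np.histogram2d(x, y, bins=2, range=[[0, 2], [0, 2]], weights=v)

# H is indexed as H[x_bin, y_bin]; transpose it for pcolormesh(xedges, yedges, H.T)
```

For the question's grid, the same call with `bins=20` and `range=[[xc1, xc2], [yc1, yc2]]` would produce a 20x20 array.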

Here is the output:

[image: hist2d output]

Here is the code that uses KDTree:

X, Y = np.mgrid[x.min():x.max():GSIZE*1j, y.min():y.max():GSIZE*1j]

grid = np.c_[X.ravel(), Y.ravel()]
points = np.c_[df_points.x, df_points.y]

tree = KDTree(grid)
dist, indices = tree.query(points)

grid_values = df_points.groupby(indices).v.sum()

df_grid = pd.DataFrame(grid, columns=["x", "y"])
df_grid["v"] = grid_values

fig, ax = pl.subplots(figsize=(10, 8))
ax.plot(df_points.x, df_points.y, "kx", alpha=0.2)
mapper = ax.scatter(df_grid.x, df_grid.y, c=df_grid.v,
                    cmap="viridis",
                    linewidths=0,
                    s=100, marker="o")
pl.colorbar(mapper, ax=ax);
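
The `groupby` result only has entries for grid points that received at least one point, so the other rows of `df_grid.v` end up as NaN. If a dense array with zeros is preferred, `np.bincount` with `minlength` is one option. This is a minimal sketch on a toy 3x3 grid; the variable names are illustrative, and a brute-force nearest-node search stands in for `tree.query(points)` so that the example depends only on NumPy:

```python
import numpy as np

n = 3
# Grid nodes at coordinates 0, 1, 2 on each axis, flattened like the answer's grid
X, Y = np.mgrid[0:2:n*1j, 0:2:n*1j]
grid = np.c_[X.ravel(), Y.ravel()]

points = np.array([[0.1, 0.1], [0.2, -0.1], [1.9, 2.1], [1.1, 0.9]])
values = np.array([1.0, 2.0, 4.0, 8.0])

# Index of the nearest grid node for every point (stand-in for tree.query)
sq_dist = ((points[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)
indices = sq_dist.argmin(axis=1)

# Sum the values per node; nodes without points stay 0
grid_sum = np.bincount(indices, weights=values, minlength=n * n).reshape(n, n)
```

Because the grid was flattened from `np.mgrid`, `grid_sum[i, j]` corresponds to the node at `(X[i, j], Y[i, j])`.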

The output is:

[image: scatter plot of the summed grid values]

Regarding "python - Finding a simpler way to cluster 2-D scatter data into grid array data", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34668709/
