gpt4 book ai didi

python 掩码 netcdf 数据使用 shapefile

转载 作者:太空狗 更新时间:2023-10-30 02:24:50 49 4
gpt4 key购买 nike

我正在使用以下软件包:

import pandas as pd
import numpy as np
import xarray as xr
import geopandas as gpd

我有以下存储数据的对象:

print(precip_da)

Out[]:
<xarray.DataArray 'precip' (time: 13665, latitude: 200, longitude: 220)>
[601260000 values with dtype=float32]
Coordinates:
* longitude (longitude) float32 35.024994 35.074997 35.125 35.175003 ...
* latitude (latitude) float32 5.0249977 5.074997 5.125 5.174999 ...
* time (time) datetime64[ns] 1981-01-01 1981-01-02 1981-01-03 ...
Attributes:
standard_name: convective precipitation rate
long_name: Climate Hazards group InfraRed Precipitation with St...
units: mm/day
time_step: day
geostatial_lat_min: -50.0
geostatial_lat_max: 50.0
geostatial_lon_min: -180.0
geostatial_lon_max: 180.0

看起来如下:

precip_da.mean(dim="time").plot()

Mean precipitation over NE Ethiopia

我的 shapefile 为 geopandas.GeoDataFrame代表一个多边形。

awash = gpd.read_file(shp_dir)

awash
Out[]:
OID_ Name FolderPath SymbolID AltMode Base Clamped Extruded Snippet PopupInfo Shape_Leng Shape_Area geometry
0 0 Awash_Basin Awash_Basin.kml 0 0 0.0 -1 0 None None 30.180944 9.411263 POLYGON Z ((41.78939511000004 11.5539922500000...

看起来如下:

awash.plot()

Region shapefile stored as <code>geopandas.GeoDataFrame</code>

将一个绘制在另一个之上,它们看起来像这样:

ax = awash.plot(alpha=0.2, color='black')
precip_da.mean(dim="time").plot(ax=ax,zorder=-1)

Awash Region superimposed on precipitation data

我的问题是,如何屏蔽 xarray.DataArray通过检查经纬度点是否位于存储为 geopandas.GeoDataFrame 的 shapefile 内部?

所以我只想要落在该 shapefile 中的降水值(毫米/天)。

我想做如下的事情:

masked_precip = precip_da.within(awash)

masked_precip = precip_da.loc[precip_da.isin(awash)]

编辑 1

我考虑过使用 rasterio.mask module但我不知道输入数据需要什么格式。听起来好像它做了正确的事情:

使用输入形状创建一个屏蔽或填充数组。像素被屏蔽或设置为输入形状外的无数据

转自 GIS Stack Exchange here

最佳答案

这是我从 this gist 中获取的当前工作解决方案.这是 Stephan Hoyer 对 github issue 的回答对于 xarray 项目。

在上面的其他包之上,affinerasterio 都是必需的

from rasterio import features
from affine import Affine

def transform_from_latlon(lat, lon):
""" input 1D array of lat / lon and output an Affine transformation
"""
lat = np.asarray(lat)
lon = np.asarray(lon)
trans = Affine.translation(lon[0], lat[0])
scale = Affine.scale(lon[1] - lon[0], lat[1] - lat[0])
return trans * scale

def rasterize(shapes, coords, latitude='latitude', longitude='longitude',
fill=np.nan, **kwargs):
"""Rasterize a list of (geometry, fill_value) tuples onto the given
xray coordinates. This only works for 1d latitude and longitude
arrays.

usage:
-----
1. read shapefile to geopandas.GeoDataFrame
`states = gpd.read_file(shp_dir+shp_file)`
2. encode the different shapefiles that capture those lat-lons as different
numbers i.e. 0.0, 1.0 ... and otherwise np.nan
`shapes = (zip(states.geometry, range(len(states))))`
3. Assign this to a new coord in your original xarray.DataArray
`ds['states'] = rasterize(shapes, ds.coords, longitude='X', latitude='Y')`

arguments:
---------
: **kwargs (dict): passed to `rasterio.rasterize` function

attrs:
-----
:transform (affine.Affine): how to translate from latlon to ...?
:raster (numpy.ndarray): use rasterio.features.rasterize fill the values
outside the .shp file with np.nan
:spatial_coords (dict): dictionary of {"X":xr.DataArray, "Y":xr.DataArray()}
with "X", "Y" as keys, and xr.DataArray as values

returns:
-------
:(xr.DataArray): DataArray with `values` of nan for points outside shapefile
and coords `Y` = latitude, 'X' = longitude.


"""
transform = transform_from_latlon(coords[latitude], coords[longitude])
out_shape = (len(coords[latitude]), len(coords[longitude]))
raster = features.rasterize(shapes, out_shape=out_shape,
fill=fill, transform=transform,
dtype=float, **kwargs)
spatial_coords = {latitude: coords[latitude], longitude: coords[longitude]}
return xr.DataArray(raster, coords=spatial_coords, dims=(latitude, longitude))

def add_shape_coord_from_data_array(xr_da, shp_path, coord_name):
""" Create a new coord for the xr_da indicating whether or not it
is inside the shapefile

Creates a new coord - "coord_name" which will have integer values
used to subset xr_da for plotting / analysis/

Usage:
-----
precip_da = add_shape_coord_from_data_array(precip_da, "awash.shp", "awash")
awash_da = precip_da.where(precip_da.awash==0, other=np.nan)
"""
# 1. read in shapefile
shp_gpd = gpd.read_file(shp_path)

# 2. create a list of tuples (shapely.geometry, id)
# this allows for many different polygons within a .shp file (e.g. States of US)
shapes = [(shape, n) for n, shape in enumerate(shp_gpd.geometry)]

# 3. create a new coord in the xr_da which will be set to the id in `shapes`
xr_da[coord_name] = rasterize(shapes, xr_da.coords,
longitude='longitude', latitude='latitude')

return xr_da

可以这样实现:

precip_da = add_shape_coord_from_data_array(precip_da, shp_dir, "awash")
awash_da = precip_da.where(precip_da.awash==0, other=np.nan)
awash_da.mean(dim="time").plot()

The mean rainfall just in the Awash Basin of Ethiopia

关于python 掩码 netcdf 数据使用 shapefile,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51398563/

49 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com