
python - How to downsample an xarray dataset using groupby?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 07:31:30

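I want to downsample an xarray dataset based on a particular group, so I am using groupby to select the groups and then taking 10% of the samples within each group. I am using the code below, but I get IndexError: index 1330 is out of bounds for axis 0 with size 1330. This suggests that my function is returning an empty array, but subset definitely has non-zero dimensions.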

I was using squeeze=True, which I thought would allow for new dimensions according to the GroupBy documentation, but that did not help, so I changed it to squeeze=False.

Do you know what might be going on? Thanks!

import numpy as np

# `ds` is the asker's existing xarray Dataset with a `cell` dimension
# and a `group` coordinate; its construction is not shown in the question.

# Set random seed for reproducibility
np.random.seed(0)

def select_random_cell_subset(x):
    # Draw 10% of the cells in this group, without replacement
    size = int(0.1 * len(x.cell))
    random_cells = sorted(np.random.choice(x.cell, size=size, replace=False))
    print('number of random cells:', len(random_cells))
    print('\tsome random cells:', random_cells[:5])
    subset = x.sel(cell=random_cells)
    print('subset:', subset)
    return subset

# squeeze=False because the final dataset is smaller than the original
ds_subset = ds.groupby('group', squeeze=True).apply(select_random_cell_subset)
ds_subset

Here is the error:

---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-44-39c7803e9e40> in <module>()
12
13 # squeeze=False because the final dataset is smaller than the original
---> 14 ds_subset = ds.groupby('group', squeeze=True).apply(select_random_cell_subset)
15 ds_subset

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/groupby.py in apply(self, func, **kwargs)
615 kwargs.pop('shortcut', None) # ignore shortcut if set (for now)
616 applied = (func(ds, **kwargs) for ds in self._iter_grouped())
--> 617 return self._combine(applied)
618
619 def _combine(self, applied):

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/groupby.py in _combine(self, applied)
622 coord, dim, positions = self._infer_concat_args(applied_example)
623 combined = concat(applied, dim)
--> 624 combined = _maybe_reorder(combined, dim, positions)
625 if coord is not None:
626 combined[coord.name] = coord

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/groupby.py in _maybe_reorder(xarray_obj, dim, positions)
443 return xarray_obj
444 else:
--> 445 return xarray_obj[{dim: order}]
446
447

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/dataset.py in __getitem__(self, key)
716 """
717 if utils.is_dict_like(key):
--> 718 return self.isel(**key)
719
720 if hashable(key):

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/dataset.py in isel(self, drop, **indexers)
1141 for name, var in iteritems(self._variables):
1142 var_indexers = dict((k, v) for k, v in indexers if k in var.dims)
-> 1143 new_var = var.isel(**var_indexers)
1144 if not (drop and name in var_indexers):
1145 variables[name] = new_var

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/variable.py in isel(self, **indexers)
568 if dim in indexers:
569 key[i] = indexers[dim]
--> 570 return self[tuple(key)]
571
572 def squeeze(self, dim=None):

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/variable.py in __getitem__(self, key)
398 dims = tuple(dim for k, dim in zip(key, self.dims)
399 if not isinstance(k, integer_types))
--> 400 values = self._indexable_data[key]
401 # orthogonal indexing should ensure the dimensionality is consistent
402 if hasattr(values, 'ndim'):

~/anaconda3/envs/cshl-sca-2017/lib/python3.6/site-packages/xarray/core/indexing.py in __getitem__(self, key)
476 def __getitem__(self, key):
477 key = self._convert_key(key)
--> 478 return self._ensure_ndarray(self.array[key])
479
480 def __setitem__(self, key, value):

IndexError: index 1330 is out of bounds for axis 0 with size 1330

Best answer

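This is a totally sensible thing to do, but unfortunately it does not work yet. Xarray uses some heuristics to decide whether an apply operation is of the "reduce" or the "transform" type, and in this case it incorrectly identifies the grouped operation as a "transform" because the output reuses the original dimension name. The combine step then tries to reorder the shortened result using positions from the full-length cell dimension, which is why the index ends up out of bounds. I just filed a bug report, but unfortunately a fix in xarray will be somewhat involved.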

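Probably the simplest workaround is to have the applied function return a boolean DataArray indicating which positions to keep. You can then use an indexing operation to select from the original object.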
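A minimal sketch of that workaround, under stated assumptions: the toy dataset, the variable name `expression`, and the helper `random_cell_mask` are hypothetical and only stand in for the asker's data. The applied function returns a boolean mask with the same length as its group, so the groupby result keeps the full `cell` dimension and the final selection is done on the original dataset. The sketch uses `.map`, which is the current name for this operation; on older xarray releases such as the one in the traceback, `.apply` plays the same role. It also assumes the `cell` labels are unique.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Hypothetical toy dataset: 40 cells, alternating between two groups.
n_cells = 40
ds = xr.Dataset(
    {"expression": (("cell", "gene"), rng.normal(size=(n_cells, 3)))},
    coords={
        "cell": np.arange(n_cells),
        "gene": ["a", "b", "c"],
        "group": ("cell", np.tile(["x", "y"], n_cells // 2)),
    },
)


def random_cell_mask(group_cells, fraction=0.1):
    """Return a boolean DataArray marking which cells of this group to keep."""
    n = group_cells.sizes["cell"]
    size = int(fraction * n)
    keep = np.zeros(n, dtype=bool)
    keep[rng.choice(n, size=size, replace=False)] = True
    # Same length as the input group, so the output is a genuine "transform".
    return xr.DataArray(
        keep, dims="cell", coords={"cell": group_cells["cell"].values}
    )


# One boolean value per cell, computed group by group.
mask = ds["cell"].groupby(ds["group"]).map(random_cell_mask)

# Select by label from the original dataset (assumes unique cell labels).
kept_cells = mask["cell"].values[mask.values]
ds_subset = ds.sel(cell=kept_cells)

print(ds_subset.sizes["cell"])
print(ds_subset["group"].values)
```

Selecting by label rather than by position keeps the final step independent of the order in which the groupby result is concatenated.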

Regarding "python - How to downsample an xarray dataset using groupby?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46498247/
