
Python xarray: adding a DataArray to a Dataset

Reposted. Author: 行者123. Updated: 2023-12-04 17:16:48

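A simple question, but I could not find the answer online. I have a Dataset and I just want to add a named DataArray to it, something like dataset.add({"new_array": new_data_array}). I know about merge, update, and concatenate, but my understanding is that merge combines two or more Datasets and concatenate joins two or more DataArrays into another DataArray; I do not fully understand update yet. I tried dataset.update({"new_array": new_data_array}), but I get the following error.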

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

I also tried dataset["new_array"] = new_data_array and got the same error.
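
For reference, when the shared coordinate labels are unique, both of the calls above succeed, as do `assign` and `xarray.merge`. The following is a minimal sketch, not taken from the original post: the variable names (`via_update`, `another`, `third`) and the toy data are illustrative assumptions.

```python
import numpy as np
import xarray as xr

names = ["joaquin", "manolo", "xavier"]  # unique labels (assumed toy data)
ds = xr.Dataset({"number": ("name", [23, 98, 23])}, coords={"name": names})
new_data_array = xr.DataArray(np.arange(3), dims=["name"], coords={"name": names})

# 1. Item assignment adds the variable to ds in place.
ds["new_array"] = new_data_array

# 2. update() also works in place and takes a mapping of name -> array.
ds.update({"via_update": new_data_array + 1})

# 3. assign() leaves ds untouched and returns a new Dataset.
ds2 = ds.assign(another=new_data_array * 2)

# 4. merge() accepts named DataArrays alongside Datasets and returns a new Dataset.
ds3 = xr.merge([ds, new_data_array.rename("third")])

print(sorted(ds.data_vars))
print(sorted(ds2.data_vars))
print(sorted(ds3.data_vars))
```

Item assignment and `update` mutate the existing Dataset, while `assign` and `merge` return a new one, so the choice mostly depends on whether the original object should be preserved.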

Update

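I have now found that the problem was that some of my coordinates contained duplicate values, which I was not aware of. Coordinates are used as indexes, so xarray (understandably) gets confused when it tries to combine objects along a shared coordinate. Below is an example that works.
import numpy
import xarray

names = ["joaquin", "manolo", "xavier"]  # unique labels
# 1-D array indexed by name; dims is given explicitly because newer xarray
# releases no longer infer dimensions from a coords dict
n = xarray.DataArray([23, 98, 23], dims=["name"], coords={"name": names})
print(n)
print("======")
# 3-D uint8 array that shares the "name" coordinate with n
m = numpy.random.randint(0, 256, (3, 4, 4)).astype(numpy.uint8)
mm = xarray.DataArray(m, dims=["name", "row", "column"], coords=[names, range(4), range(4)])
print(mm)
print("======")
# Turn n into a Dataset, then add mm as a second data variable
n_dataset = n.rename("number").to_dataset()
n_dataset["mm"] = mm
print(n_dataset)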

Output:
<xarray.DataArray (name: 3)>
array([23, 98, 23])
Coordinates:
* name (name) <U7 'joaquin' 'manolo' 'xavier'
======
<xarray.DataArray (name: 3, row: 4, column: 4)>
array([[[ 55, 63, 250, 211],
[204, 151, 164, 237],
[182, 24, 211, 12],
[183, 220, 35, 78]],

[[208, 7, 91, 114],
[195, 30, 108, 130],
[ 61, 224, 105, 125],
[ 65, 1, 132, 137]],

[[ 52, 137, 62, 206],
[188, 160, 156, 126],
[145, 223, 103, 240],
[141, 38, 43, 68]]], dtype=uint8)
Coordinates:
* name (name) <U7 'joaquin' 'manolo' 'xavier'
* row (row) int64 0 1 2 3
* column (column) int64 0 1 2 3
======
<xarray.Dataset>
Dimensions: (column: 4, name: 3, row: 4)
Coordinates:
* name (name) object 'joaquin' 'manolo' 'xavier'
* row (row) int64 0 1 2 3
* column (column) int64 0 1 2 3
Data variables:
number (name) int64 23 98 23
mm (name, row, column) uint8 55 63 250 211 204 151 164 237 182 24 ...

The code above uses names as an index. If I change it slightly so that names contains a duplicate, for example names = ["joaquin", "manolo", "joaquin"], I get an InvalidIndexError.

Code:
import numpy
import xarray

names = ["joaquin", "manolo", "joaquin"]  # "joaquin" appears twice, so the index is not unique
n = xarray.DataArray([23, 98, 23], dims=["name"], coords={"name": names})
print(n)
print("======")
m = numpy.random.randint(0, 256, (3, 4, 4)).astype(numpy.uint8)
mm = xarray.DataArray(m, dims=["name", "row", "column"], coords=[names, range(4), range(4)])
print(mm)
print("======")
n_dataset = n.rename("number").to_dataset()
# This assignment is the line that raised InvalidIndexError in the traceback below
n_dataset["mm"] = mm
print(n_dataset)

Output:
<xarray.DataArray (name: 3)>
array([23, 98, 23])
Coordinates:
* name (name) <U7 'joaquin' 'manolo' 'joaquin'
======
<xarray.DataArray (name: 3, row: 4, column: 4)>
array([[[247, 3, 20, 141],
[ 54, 111, 224, 56],
[144, 117, 131, 192],
[230, 44, 174, 14]],

[[225, 184, 170, 248],
[ 57, 105, 165, 70],
[220, 228, 238, 17],
[ 90, 118, 87, 30]],

[[158, 211, 31, 212],
[ 63, 172, 190, 254],
[165, 163, 184, 22],
[ 49, 224, 196, 244]]], dtype=uint8)
Coordinates:
* name (name) <U7 'joaquin' 'manolo' 'joaquin'
* row (row) int64 0 1 2 3
* column (column) int64 0 1 2 3
======
---------------------------------------------------------------------------
InvalidIndexError Traceback (most recent call last)
<ipython-input-12-50863379cefe> in <module>()
8 print("======")
9 n_dataset = n.rename("number").to_dataset()
---> 10 n_dataset["mm"] = mm
11 print(n_dataset)

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/dataset.py in __setitem__(self, key, value)
536 raise NotImplementedError('cannot yet use a dictionary as a key '
537 'to set Dataset values')
--> 538 self.update({key: value})
539
540 def __delitem__(self, key):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/dataset.py in update(self, other, inplace)
1434 dataset.
1435 """
-> 1436 variables, coord_names, dims = dataset_update_method(self, other)
1437
1438 return self._replace_vars_and_dims(variables, coord_names, dims,

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/merge.py in dataset_update_method(dataset, other)
492 priority_arg = 1
493 indexes = dataset.indexes
--> 494 return merge_core(objs, priority_arg=priority_arg, indexes=indexes)

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/merge.py in merge_core(objs, compat, join, priority_arg, explicit_coords, indexes)
373 coerced = coerce_pandas_values(objs)
374 aligned = deep_align(coerced, join=join, copy=False, indexes=indexes,
--> 375 skip_single_target=True)
376 expanded = expand_variable_dicts(aligned)
377

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/alignment.py in deep_align(list_of_variable_maps, join, copy, indexes, skip_single_target)
162
163 aligned = partial_align(*targets, join=join, copy=copy, indexes=indexes,
--> 164 skip_single_target=skip_single_target)
165
166 for key, aligned_obj in zip(keys, aligned):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/alignment.py in partial_align(*objects, **kwargs)
122 valid_indexers = dict((k, v) for k, v in joined_indexes.items()
123 if k in obj.dims)
--> 124 result.append(obj.reindex(copy=copy, **valid_indexers))
125
126 return tuple(result)

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/dataset.py in reindex(self, indexers, method, tolerance, copy, **kw_indexers)
1216
1217 variables = alignment.reindex_variables(
-> 1218 self.variables, self.indexes, indexers, method, tolerance, copy=copy)
1219 return self._replace_vars_and_dims(variables)
1220

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/alignment.py in reindex_variables(variables, indexes, indexers, method, tolerance, copy)
234 target = utils.safe_cast_to_index(indexers[name])
235 indexer = index.get_indexer(target, method=method,
--> 236 **get_indexer_kwargs)
237
238 to_shape[name] = len(target)

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
2080
2081 if not self.is_unique:
-> 2082 raise InvalidIndexError('Reindexing only valid with uniquely'
2083 ' valued Index objects')
2084

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

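So this is not a bug in xarray. Still, I wasted a lot of time tracking this down, and I wish the error message were more informative. I hope the xarray maintainers can address this soon (for example, by checking coordinates for uniqueness before attempting a merge).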
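
A uniqueness pre-check of the kind suggested above can be sketched as follows. This is a rough illustration, not part of xarray: `find_duplicate_coords` is a hypothetical helper name, and it relies only on the `indexes` property of xarray objects and on pandas' `Index.is_unique` and `Index.duplicated`.

```python
import numpy as np
import xarray as xr


def find_duplicate_coords(obj):
    """Return {dimension: sorted repeated labels} for every non-unique index.

    Hypothetical helper; obj may be a DataArray or a Dataset.
    """
    duplicates = {}
    for dim, index in obj.indexes.items():
        if not index.is_unique:
            # duplicated() marks every occurrence after the first one
            duplicates[dim] = sorted(set(index[index.duplicated()]))
    return duplicates


names = ["joaquin", "manolo", "joaquin"]  # "joaquin" is repeated
mm = xr.DataArray(
    np.zeros((3, 4, 4), dtype=np.uint8),
    dims=["name", "row", "column"],
    coords=[names, range(4), range(4)],
)

report = find_duplicate_coords(mm)
print(report)
```

Running such a check on each object before combining them points directly at the offending dimension and labels instead of leaving that to a traceback from deep inside the alignment code.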

In any case, the approach given in my answer below still works.

Best answer

Thanks for the detailed report. This issue is now fixed in the latest release of xarray (v0.8.2).

We fixed the behavior in two ways:

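  • Alignment operations between xarray objects now succeed even with non-unique indexes, as long as the non-unique index takes the same values on all objects.
  • If you try to align objects whose non-unique indexes are not identical, you now get an informative error message that names the index with duplicate values, e.g. ValueError: cannot reindex or align along dimension 'x' because the index has duplicate values.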

For more on adding a DataArray to a Dataset with Python xarray, see the similar question on Stack Overflow: https://stackoverflow.com/questions/38826505/
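
The two behaviors described in the answer can be sketched as follows, assuming xarray v0.8.2 or later. The variable names (`same`, `shuffled`, `error_message`) and the toy data are illustrative assumptions, and the exact wording of the error message may differ between releases.

```python
import numpy as np
import xarray as xr

names = ["joaquin", "manolo", "joaquin"]  # non-unique index
ds = (
    xr.DataArray([23, 98, 23], dims=["name"], coords={"name": names})
    .rename("number")
    .to_dataset()
)

# Case 1: the new array carries exactly the same non-unique index,
# so no reindexing is needed and the assignment succeeds.
same = xr.DataArray(np.zeros((3, 2)), dims=["name", "col"], coords={"name": names})
ds["same"] = same

# Case 2: the new array has a different non-unique index, so alignment
# would require reindexing along "name", which is ambiguous with duplicates.
shuffled = xr.DataArray(
    np.ones(3), dims=["name"], coords={"name": ["joaquin", "joaquin", "manolo"]}
)
try:
    ds["shuffled"] = shuffled
    error_message = None
except ValueError as err:
    error_message = str(err)

print(sorted(ds.data_vars))
print(error_message)
```

If the second case comes up in practice, the duplicates have to be resolved first (for example by relabeling or dropping the repeated entries) before the arrays can be combined.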
