python - How to find the row indices of a dask array's partitions

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:14:40

I have a 2D dask array of shape (4950, 4950) that I want to process in parallel. Following the advice at https://docs.dask.org/en/latest/delayed-best-practices.html#don-t-call-dask-delayed-on-other-dask-collections I wrote:

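import dask

# `da` is the (4950, 4950) dask array; `funct` is the function applied to each piece
print(da.shape)
partitions = da.to_delayed()  # 2D NumPy array of Delayed objects, one per block
print(partitions)
# Iterating over `partitions` yields rows of blocks, so this creates one task per block-row
delayed_values = [dask.delayed(funct)(part) for part in partitions]
print(delayed_values)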

The output I get is:

(4950, 4950)
[[Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 0, 0))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 0, 1))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 0, 2))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 0, 3))]
[Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 1, 0))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 1, 1))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 1, 2))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 1, 3))]
[Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 2, 0))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 2, 1))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 2, 2))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 2, 3))]
[Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 3, 0))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 3, 1))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 3, 2))
Delayed(('gt-f3b8d1635832fc9b88447def18b4b7d0', 3, 3))]]
[Delayed('funct-c0044e9f-4b8e-4d02-b364-f6a483eaae2f'),
Delayed('funct-d2d14dcd-6f0a-4198-b999-221b0609bcaa'),
Delayed('funct-1951008c-14f4-43da-bbc1-443e90aae029'),
Delayed('funct-a254e3ba-2d45-45f8-bae4-85ba8c37a32f')]

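I want to work out the row indices (first and last index) of each partition, so that I can save the result computed for each index in the final output file.

I could not find much documentation about partitions, so any help or links for finding the row indices would be greatly appreciated.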

Best answer

For a Dask array, you will want to look at the .chunks attribute. In particular, I think you may want something like:

[np.cumsum(c) for c in x.chunks]

For details, see https://docs.dask.org/en/latest/array-design.html#chunks
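The cumulative sums above give the exclusive end offset of each chunk along every axis. As a minimal sketch of turning those into the first and last row index per chunk, the helper below takes the tuple of chunk sizes along axis 0 (what `x.chunks[0]` holds). The name `chunk_row_ranges` and the example sizes `(4, 4, 2)` are hypothetical and chosen only for illustration; `itertools.accumulate` stands in for `np.cumsum` so the sketch needs nothing beyond the standard library.

```python
from itertools import accumulate


def chunk_row_ranges(row_chunks):
    """Return an inclusive (first_row, last_row) pair for each chunk along axis 0.

    `row_chunks` is a tuple of chunk sizes, e.g. `x.chunks[0]` for a dask array `x`.
    """
    # Running totals are the exclusive end offsets (same values as np.cumsum).
    stops = list(accumulate(row_chunks))
    # Each chunk starts where the previous one stopped; the first starts at 0.
    starts = [0] + stops[:-1]
    return [(start, stop - 1) for start, stop in zip(starts, stops)]


# Hypothetical chunk sizes: a 10-row array split into row blocks of 4, 4 and 2.
example_row_chunks = (4, 4, 2)
ranges = chunk_row_ranges(example_row_chunks)
print(ranges)  # [(0, 3), (4, 7), (8, 9)]
```

With a real dask array `x`, calling `chunk_row_ranges(x.chunks[0])` should give one pair per block-row, so the block at position `(i, j)` in `x.to_delayed()` covers the rows in `ranges[i]`.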

Regarding "python - How to find the row indices of a dask array's partitions", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58531181/
