
dask dataframe set_index throws an error

Repost. Author: 行者123. Updated: 2023-12-01 14:59:32

I have a dask dataframe created from parquet files on HDFS. When I set the index using the set_index API, it fails with the following error.

  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/dask/dataframe/shuffle.py", line 64, in set_index
    divisions, sizes, mins, maxes = base.compute(divisions, sizes, mins, maxes)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/dask/base.py", line 206, in compute
    results = get(dsk, keys, **kwargs)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/distributed/client.py", line 1949, in get
    results = self.gather(packed, asynchronous=asynchronous)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/distributed/client.py", line 1391, in gather
    asynchronous=asynchronous)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/distributed/client.py", line 561, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/distributed/utils.py", line 241, in sync
    six.reraise(*error[0])
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/distributed/utils.py", line 229, in f
    result[0] = yield make_coro()
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "", line 4, in raise_exc_info
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/distributed/client.py", line 1269, in _gather
    traceback)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/ebs/d1/agent/conda/envs/py361/lib/python3.6/site-packages/dask/dataframe/io/parquet.py", line 144, in _read_parquet_row_group
    open=open, assign=views, scheme=scheme)
TypeError: read_row_group_file() got an unexpected keyword argument 'scheme'

Can anyone tell me the cause of this error and how to fix it?

Best answer

Solution

Upgrade fastparquet to version 0.1.3.
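Before upgrading, it can help to confirm which fastparquet version the environment actually has. The following is a minimal sketch, not part of the original answer: the helper names release_tuple, needs_upgrade and check_fastparquet are hypothetical, and importlib.metadata requires Python 3.8 or later (the Python 3.6 environment in the traceback would need a different lookup).

```python
import re
from importlib import metadata  # standard library, Python 3.8+

# Minimum version named in the answer above.
MINIMUM = "0.1.3"


def release_tuple(version):
    """Turn the leading numeric part of a version string into a tuple of ints.

    Hypothetical helper: "0.1.3.post1" becomes (0, 1, 3).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def needs_upgrade(installed, minimum=MINIMUM):
    """Return True when the installed version is older than the minimum."""
    return release_tuple(installed) < release_tuple(minimum)


def check_fastparquet():
    """Describe the fastparquet version found in the current environment."""
    try:
        installed = metadata.version("fastparquet")
    except metadata.PackageNotFoundError:
        return "fastparquet is not installed in this environment"
    if needs_upgrade(installed):
        return f"fastparquet {installed} is older than {MINIMUM}; upgrade it"
    return f"fastparquet {installed} satisfies >= {MINIMUM}"


if __name__ == "__main__":
    print(check_fastparquet())
```

Because the traceback passes through the distributed client, the version that matters is presumably the one installed in the environment where the workers run, so the check is most useful when run there.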

Details

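Dask 0.15.4, which your example uses, includes this commit, which adds the scheme argument to the read_row_group_file() call. fastparquet versions before 0.1.3 do not accept that argument, so the call raises the TypeError shown in the traceback.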
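The mechanism can be illustrated without dask or fastparquet. The sketch below uses two hypothetical stand-in functions (they are not the real fastparquet signatures, and the file name, column list and scheme value are placeholders) to show how passing a new keyword argument to an older function produces this kind of TypeError.

```python
def read_row_group_file_old(fn, columns):
    """Hypothetical stand-in for a reader that predates the scheme argument."""
    return {"file": fn, "columns": columns}


def read_row_group_file_new(fn, columns, scheme="hive"):
    """Hypothetical stand-in for a reader that accepts the scheme argument."""
    return {"file": fn, "columns": columns, "scheme": scheme}


def call_like_dask(reader):
    """Call the reader with scheme=..., as the last traceback frame does.

    Returns the reader's result, or the error text if the reader rejects
    the keyword argument.
    """
    try:
        return reader("part.0.parquet", ["a"], scheme="hive")
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(call_like_dask(read_row_group_file_old))
    print(call_like_dask(read_row_group_file_new))
```

The first call fails because the older signature has no scheme parameter; the second succeeds. Upgrading the callee so that it accepts the keyword is what resolves the mismatch.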

Regarding "dask dataframe set_index throws an error", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46626670/
