
python - Dask agg function pickle error

Reposted. Author: 行者123. Updated: 2023-12-01 15:55:57

I have the following dask dataframe:

@timestamp                        datetime64[ns]
@version                                  object
dst                                       object
dst_port                                  object
host                                      object
http_req_header_contentlength             object
http_req_header_host                      object
http_req_header_referer                   object
http_req_header_useragent                 object
http_req_method                           object
http_req_secondleveldomain                object
http_req_url                              object
http_req_version                          object
http_resp_code                            object
http_resp_header_contentlength            object
http_resp_header_contenttype              object
http_user                                 object
local_time                                object
path                                      object
src                                       object
src_port                                  object
tags                                      object
type                                       int64
dtype: object

I want to get a grouped result with this operation:

grouped_by_df = df.groupby(['http_user', 'src'])['@timestamp'].agg(['min', 'max']).reset_index()

Running `grouped_by_df.count().compute()` raises the following error:

Traceback (most recent call last):
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-62-9acb48b4ac67>", line 1, in <module>
    user_host_map.count().compute()
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/dask/base.py", line 98, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/dask/base.py", line 205, in compute
    results = get(dsk, keys, **kwargs)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/client.py", line 1893, in get
    results = self.gather(packed)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/client.py", line 1355, in gather
    direct=direct, local_worker=local_worker)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/client.py", line 531, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/utils.py", line 234, in sync
    six.reraise(*error[0])
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/utils.py", line 223, in f
    result[0] = yield make_coro()
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/client.py", line 1235, in _gather
    traceback)
  File "/home/avlach/virtualenvs/dask/local/lib/python2.7/site-packages/distributed/protocol/pickle.py", line 59, in loads
    return pickle.loads(x)
TypeError: itemgetter expected 1 arguments, got 0

I am using dask version 0.15.1 with a LocalCluster client. What is causing this problem?
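A note on the last line of the traceback: the exception is raised inside pickle.loads while results are gathered, and the message says an operator.itemgetter was constructed without arguments. The paths in the traceback point at Python 2.7, where operator.itemgetter objects had no pickle support (it was added in Python 3.5). Whether the agg code path in this dask version actually ships an itemgetter between processes is an assumption here, not something the traceback proves. The sketch below only shows the round trip that works on Python 3.5 and later; the names getter, restored, and row are made up for illustration.

```python
import pickle
from operator import itemgetter

# A getter that pulls one named entry out of a mapping
# (hypothetical stand-in for whatever object was being sent).
getter = itemgetter("max")

# Serialize and deserialize it, as happens when objects
# travel between a worker process and the client.
restored = pickle.loads(pickle.dumps(getter))

# The restored getter behaves like the original one.
row = {"min": 1, "max": 9}
value = restored(row)
print(value)
```

If the same round trip fails in your environment, that points at the interpreter version rather than at the groupby itself.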

Best answer

We just ran into a similar error. We were running something of this form:

df[['col1','col2']].groupby('col1').agg("count")

and got a similar error at the end of the traceback:

    return pickle.loads(x)
TypeError: itemgetter expected 1 arguments, got 0

But when we rewrote the groupby in the following form:

df.groupby('col1')['col2'].count()

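we no longer got the error. We have now reproduced this several times, so it does not seem to be a fluke. We are not at all sure why this happens, but it is worth a try if anyone is struggling with the same problem.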

Regarding python - Dask agg function pickle error, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47219532/
