
python - Convert a numpy array to a dask dataframe column?

Reposted. Author: 行者123. Updated: 2023-12-05 01:39:47

I have a numpy array that I want to add as a column to an existing dask dataframe.

enc = LabelEncoder()
nparr = enc.fit_transform(X[['url']])

I have ddf, which is a dask dataframe.

ddf['nurl'] = nparr   ???

Is there an elegant way to achieve the above?

"Python PANDAS: Converting from pandas/numpy to dask dataframe/array" does not solve my issue, as I want to add a numpy array to an existing dask dataframe.

Best answer

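You can convert the numpy array to a dask Series object and then merge it into the dataframe. You will need to call the Series object's .to_frame() method, because merge only supports merging a dataframe with other dataframes.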

import dask.dataframe as dd
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': range(30), 'y': range(0,300, 10)})
arr = np.random.randint(0, 100, size=30)

# create dask frame and series
ddf = dd.from_pandas(df, npartitions=5)
darr = dd.from_array(arr)
# give it a name to use as a column head
darr.name = 'z'

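# the two frames share no column names, so join on the index explicitly
ddf2 = ddf.merge(darr.to_frame(), left_index=True, right_index=True)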

ddf2
# returns:
Dask DataFrame Structure:
                   x      y      z
npartitions=5
0              int64  int64  int32
6                ...    ...    ...
...              ...    ...    ...
24               ...    ...    ...
29               ...    ...    ...
Dask Name: join-indexed, 33 tasks

Regarding "python - Convert a numpy array to a dask dataframe column?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57607155/
