
python - MultiLabelBinarizer: 'float' object is not iterable

Reposted. Author: 行者123. Updated: 2023-11-30 09:05:49

I have the following DataFrame:

df[['row_num','set_id']].head()

row_num   set_id
988681    [31672, 0]
988680    [31965, 0]
988679    [0, 78464]

I am trying to use MultiLabelBinarizer, but it fails with the error 'float' object is not iterable:

from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer()
mlb.fit_transform(df['set_id'].str.split(','))

TypeError: 'float' object is not iterable

Best answer

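I think the problem is the missing values: a NaN stays a float after .str.split(','), and MultiLabelBinarizer cannot iterate over it. Given data like the following, you can use: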

print (df)
   row_num     set_id
0   988681        NaN
1   988680  [31965,0]
2   988679  [0,78464]

import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()

# Boolean mask that is True for rows where set_id is not NaN
mask = df['set_id'].notnull()

# Binarize only the non-missing rows (the mask already excludes NaN)
arr = mlb.fit_transform(df.loc[mask, 'set_id'].str.strip('[]').str.split(','))

# Build a DataFrame from the result, then restore the NaN rows filled with 0
df = (pd.DataFrame(arr, index=df.index[mask], columns=mlb.classes_)
        .reindex(df.index, fill_value=0))

print (df)
   0  31965  78464
0  0      0      0
1  1      1      0
2  1      0      1
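
For reference, here is a self-contained sketch that reproduces the error and shows one alternative way around it. It assumes set_id holds strings such as "[31965, 0]" (as the .str calls in the answer imply). The sample data and the parse_labels helper are hypothetical, not part of the original answer. Instead of masking and reindexing, it maps each missing value to an empty list, which MultiLabelBinarizer encodes as an all-zero row. It also strips whitespace around each label, since splitting "[31965, 0]" on ',' alone would yield ' 0', a different label from '0'.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical sample data: set_id holds strings, with one missing value.
df = pd.DataFrame({
    "row_num": [988681, 988680, 988679],
    "set_id": [np.nan, "[31965, 0]", "[0, 78464]"],
})

# Reproduce the failure: .str.split leaves NaN (a float) in place,
# and MultiLabelBinarizer tries to iterate over every element.
try:
    MultiLabelBinarizer().fit_transform(df["set_id"].str.strip("[]").str.split(","))
except TypeError as exc:
    print("TypeError:", exc)


def parse_labels(value):
    """Turn '[31965, 0]' into ['31965', '0']; map missing values to []."""
    if isinstance(value, str):
        parts = value.strip("[]").split(",")
        return [part.strip() for part in parts if part.strip()]
    return []  # NaN is not iterable, so substitute an empty label set


labels = df["set_id"].apply(parse_labels)

mlb = MultiLabelBinarizer()
encoded = pd.DataFrame(
    mlb.fit_transform(labels),
    index=df.index,
    columns=mlb.classes_,
)
print(encoded)
```

Note that the labels here are strings ('0', '31965', '78464'), as in the answer above. If integer labels are needed, the parts could be converted with int() inside the helper.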

Source: the original question on Stack Overflow, "python - MultiLabelBinarizer: 'float' object is not iterable": https://stackoverflow.com/questions/52498640/
