python - ValueError: bad input shape in sklearn Python

Reposted. Author: 太空宇宙. Updated: 2023-11-04 07:53:20

I have 2 lists, features and labels. features contains Diseases, Age, Gender and PIN. labels contains Health-Plan.

The user passes user_input in the same format as features. The code should then use the DecisionTree from the sklearn API to predict a health plan for that user.

A few of the parameters in features are strings, for example Disease and Gender. I am using LabelEncoder to encode them to avoid the error 'ValueError: could not convert string to float'.

Now, after using LabelEncoder, I get the following exception: 'ValueError: bad input shape'.

How can I resolve this, and how can I reverse the encoding afterwards while still avoiding the string-to-float error? Please help.

from sklearn import tree
from sklearn.preprocessing import LabelEncoder
features = [['TB' , 28, 'MALE', 121001], ['TB' , 28, 'FEMALE', 121002], ['CANCER' , 28, 'MALE', 121001], ['CANCER' , 28, 'FEMALE', 121001]]
labels = ['X125434', 'X125436','X125437' , 'X125437']
user_input = ['TB' , 28, 'MALE', 121001]

le = LabelEncoder()

# features is a nested (2-D) list, so this call raises 'ValueError: bad input shape'
new_features = le.fit_transform(features)
new_labels = le.fit_transform(labels)
new_ui = le.fit_transform(user_input)

clf = tree.DecisionTreeClassifier()
clf = clf.fit(new_features, new_labels)

print(clf.predict([new_ui]))

Best answer

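It is not advisable to use the same label encoder for every feature in the dataset. It is safer to create one label encoder per column, because each feature has its own set of values.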
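To make the reasoning concrete, here is a minimal sketch of what goes wrong with a single shared encoder. The variable names (disease_classes, gender_classes, shape_error) are made up for illustration, and the exact wording of the error message varies between scikit-learn versions.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()

# LabelEncoder learns its mapping from a single 1-D column of values.
le.fit(['TB', 'TB', 'CANCER', 'CANCER'])
disease_classes = list(le.classes_)

# Refitting the same object on another column replaces the learned classes,
# so the disease mapping is no longer available afterwards.
le.fit(['MALE', 'FEMALE', 'MALE', 'FEMALE'])
gender_classes = list(le.classes_)

# A nested (2-D) list is rejected with a ValueError, which is the
# shape error described in the question.
try:
    le.fit([['TB', 'MALE'], ['CANCER', 'FEMALE']])
    shape_error = None
except ValueError as exc:
    shape_error = str(exc)

print(disease_classes, gender_classes)
print(shape_error)
```

Because every fit call overwrites classes_, one encoder per column is what keeps each mapping available for transform and inverse_transform later on.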

from sklearn import tree
from sklearn.preprocessing import LabelEncoder
import pandas as pd

features = [['TB', 28, 'MALE', 121001], ['TB', 28, 'FEMALE', 121002], ['CANCER', 28, 'MALE', 121001], ['CANCER', 28, 'FEMALE', 121001]]
labels = ['X125434', 'X125436', 'X125437', 'X125437']
feature_names = ['Disease', 'Age', 'Gender', 'PIN']

user_input = ['TB', 28, 'MALE', 121001]

# Training frame: one column per feature plus the target column
train = pd.DataFrame(data=features, columns=feature_names)
train['Labels'] = labels

# Single-row frame holding the user's input
test = pd.DataFrame([user_input], columns=feature_names)

# One encoder per string column, plus one for the target
le_disease = LabelEncoder()
le_gender = LabelEncoder()
le_labels = LabelEncoder()

train['Disease'] = le_disease.fit_transform(train['Disease'])
train['Gender'] = le_gender.fit_transform(train['Gender'])
train['Labels'] = le_labels.fit_transform(train['Labels'])

# Reuse the fitted encoders on the user input (transform, not fit_transform)
test['Disease'] = le_disease.transform(test['Disease'])
test['Gender'] = le_gender.transform(test['Gender'])

clf = tree.DecisionTreeClassifier()
clf = clf.fit(train[feature_names], train['Labels'])

# Decode the predicted integer class back to the original health-plan label
print(le_labels.inverse_transform(clf.predict(test[feature_names])))

LabelEncoder.inverse_transform() can be used to recover the original values.
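As a small illustration of that round trip, the sketch below encodes the labels list from the question and decodes it again. The names encoded and decoded are chosen here for the example only.

```python
from sklearn.preprocessing import LabelEncoder

labels = ['X125434', 'X125436', 'X125437', 'X125437']

le_labels = LabelEncoder()

# fit_transform assigns one integer code per distinct label (sorted order)
encoded = le_labels.fit_transform(labels)

# inverse_transform maps the integer codes back to the original strings
decoded = le_labels.inverse_transform(encoded)

print(list(encoded))
print(list(decoded))
```

The same fitted encoder has to be used in both directions, which is why the answer keeps le_labels around until after the prediction.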

Regarding python - ValueError: bad input shape in sklearn Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52112414/
