
python - sklearn LabelEncoder: TypeError: '<' not supported between instances of 'int' and 'str'

Reposted. Author: 行者123. Updated: 2023-11-30 09:17:22

I want to use the KNN algorithm for text classification. My data is in a .csv file.

If I print the data with this code, it looks like this:

# Preprocessing

X = np.array(dataset.iloc[:, :1])
y = np.array(dataset['Class'])

print("Data variabel X : ", X)
print("Data variabel y : ", y)

Output:

[['pada awalnya aku memandang gadis itu nani namanya']
 ['dua buah melon yang subur segar']]
['Pornografi' 'Non-Pornografi']

I split the data into training and test sets:

# Train Test Split

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

# loading library
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder

# Feature Scaling
lb = LabelEncoder()
lb.fit(X_train)

X_train = lb.transform(X_train)
X_test = lb.transform(X_test)

print("X_train : ", X_train)
print("X_test : ", X_test)

# instantiate learning model (k = 3)
knn = KNeighborsClassifier(n_neighbors=3)

# fitting the model
knn.fit([[X_train, y_train]], [y])

# predict the response
pred = knn.predict(X_test)

# evaluate accuracy
print (accuracy_score(y_test, pred))

I get this error message:

    <ipython-input-223-7d80eb4ea7d1> in <module>()
8
9 X_train = lb.transform(X_train)
---> 10 X_test = lb.transform(X_test)
11
12 print("X_train : ", X_train)
TypeError: '<' not supported between instances of 'int' and 'str'

What is wrong with my code?

Best answer

Try this:

lb.transform(X_test.astype(str))

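Basically, you need to convert the data to a single type. LabelEncoder has to order the values it sees, and Python 3 cannot compare an int with a str, so a column that mixes the two raises this TypeError. Casting everything to str gives the encoder one comparable type.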
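To make the cast concrete, here is a minimal sketch of the failure and the fix. The sample values are hypothetical stand-ins for the CSV column (one cell is an integer, as can happen when a cell contains only digits), and applying `astype(str)` to the training data as well as the test data is an assumption: the answer only shows the test-side cast, but both calls presumably need to see the same type.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: an object array mixing strings with one integer.
train = np.array(["dua buah melon", 123, "pada awalnya"], dtype=object)
test = np.array([123, "dua buah melon"], dtype=object)

lb = LabelEncoder()

# Fitting on the mixed column is expected to fail, because the encoder
# sorts the unique values and int/str cannot be ordered in Python 3.
try:
    lb.fit(train)
except TypeError as exc:
    print("mixed types rejected:", exc)

# Cast both sides to str so the encoder sees one comparable type.
lb.fit(train.astype(str))
encoded_test = lb.transform(test.astype(str))

print(lb.classes_)
print(encoded_test)
```

Note that LabelEncoder expects a 1-D array, while `X` in the question has shape (n, 1), so flattening it first (for example with `X.ravel()`) may also be needed.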

This post on python - sklearn LabelEncoder: TypeError: '<' not supported between instances of 'int' and 'str' is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/51758409/
