
python - SMOTE - could not convert string to float

Reposted. Author: 行者123. Updated: 2023-12-05 01:32:51

I think I am missing something in the code below.

import pandas as pd
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Features are the raw message text; the target is the 0/1 label
X = df[['Spam']]
y = df['Value']

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=40)

# Oversample the minority class (this is the line that raises the error)
X_resampled, y_resampled = SMOTE().fit_resample(X_train, y_train)

sm = pd.concat([X_resampled, y_resampled], axis=1)

I get this error:

---> 19 X_resampled, y_resampled = SMOTE().fit_resample(X_train, y_train)

ValueError: could not convert string to float:

A sample of the data:

Spam                                      Value
Your microsoft account was compromised    1
Manchester United lost against PSG        0
I like cooking                            0

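I considered transforming the training and test sets to fix whatever is causing the error, but I don't know how to apply the transformation to both of them. I have tried some examples I found on Google, but they did not solve the problem.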

Best answer

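Convert the text data to numbers before applying SMOTE, as follows. SMOTE builds synthetic samples by interpolating between numeric feature vectors, so it cannot work on raw strings.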

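from sklearn.feature_extraction.text import CountVectorizer

# Learn the vocabulary from the training text only
vectorizer = CountVectorizer()
vectorizer.fit(X_train.values.ravel())

# Turn each message into a vector of word counts
X_train = vectorizer.transform(X_train.values.ravel())
X_test = vectorizer.transform(X_test.values.ravel())

# Convert the sparse matrices to dense arrays
X_train = X_train.toarray()
X_test = X_test.toarray()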

Then add your SMOTE code:

X_train = pd.DataFrame(X_train)
X_resampled, y_resampled = SMOTE().fit_resample(X_train, y_train)

Regarding python - SMOTE - could not convert string to float, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65280842/
