
How do I format data in Python so that scikit-learn lets me call .fit(X, y)?

Reposted. Author: 行者123. Updated: 2023-11-30 22:38:22

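I wouldn't normally ask you all for help, but I have spent a lot of time trying to figure out what I am doing wrong and have failed miserably. I am trying to train a neural network on some data I collected, using the scikit-learn library in Python.

The page I am using as a reference: http://scikit-learn.org/stable/modules/neural_networks_supervised.html

My training_x data ends up as an array of arrays that looks similar to this: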

[[0.1, 0.2, -0.1], [0.21, -0.32, 0.3]]

training_y is an array of floats, like this: [0.3, 0.2]

from datetime import timedelta
from sklearn.neural_network import MLPClassifier

# Build the training lists day by day, company by company.
training_x = []
training_y = []
for day_offset in range(int((end_date - start_date).days) + 1):
    curr_day = start_date + timedelta(day_offset)
    for company in companies:
        output_training_data(cursor, training_x, training_y, company, curr_day)

clf = MLPClassifier(solver='adam', alpha=1e-5, hidden_layer_sizes=(5, 3), random_state=1)
clf.fit(training_x, training_y)  # this call raises the ValueError

I then get the following error:

Traceback (most recent call last):
  File "/Users/jodymcadams/Documents/GitHub/moneygen/create_training_data.py", line 194, in <module>
    main()
  File "/Users/jodymcadams/Documents/GitHub/moneygen/create_training_data.py", line 191, in main
    update_data(app_config, companies)
  File "/Users/jodymcadams/Documents/GitHub/moneygen/create_training_data.py", line 169, in update_data
    update_tweets(app_config, companies)
  File "/Users/jodymcadams/Documents/GitHub/moneygen/create_training_data.py", line 154, in update_tweets
    process_twitter(cursor, companies)
  File "/Users/jodymcadams/Documents/GitHub/moneygen/create_training_data.py", line 136, in process_twitter
    clf.fit(training_x, training_y)
  File "/usr/local/lib/python2.7/site-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
    return self._fit(X, y, incremental=False)
  File "/usr/local/lib/python2.7/site-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
    X, y = self._validate_input(X, y, incremental)
  File "/usr/local/lib/python2.7/site-packages/sklearn/neural_network/multilayer_perceptron.py", line 908, in _validate_input
    self._label_binarizer.fit(y)
  File "/usr/local/lib/python2.7/site-packages/sklearn/preprocessing/label.py", line 304, in fit
    self.classes_ = unique_labels(y)
  File "/usr/local/lib/python2.7/site-packages/sklearn/utils/multiclass.py", line 98, in unique_labels
    raise ValueError("Unknown label type: %s" % repr(ys))
ValueError: Unknown label type: (array([ -8.60708650e-04, -1.63581100e-03, 9.93761387e-04,
3.86313466e-04, 4.85415472e-04, 9.92796708e-05,
-7.66657374e-04, -1.60558464e-03, 2.50678922e-03,
-9.75813759e-04, -1.11646082e-03, -2.30801511e-03,
-1.48148148e-03, -2.47524752e-03, 9.89119683e-04,
-4.94804552e-04, 4.94559842e-04, -9.90099010e-04,
2.72479564e-03, -2.36707939e-03, -3.64298725e-04,
1.36425648e-03, -1.81933958e-04, -5.12023407e-03,

Best answer

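For a classifier, your labels must be discrete values such as integers. Continuous float values cannot serve as a set of distinct class labels, which is why MLPClassifier rejects them with "Unknown label type".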
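One way to see how scikit-learn interprets a label array before calling fit is the type_of_target helper from sklearn.utils.multiclass. The sketch below is only an illustration: the two label lists are made-up sample values, not the asker's real data.

```python
from sklearn.utils.multiclass import type_of_target

# Hypothetical sample labels for illustration only.
float_labels = [0.3, 0.2]  # continuous values, shaped like training_y in the question
int_labels = [1, 0]        # discrete class labels

# type_of_target reports how scikit-learn would categorize each target array.
float_kind = type_of_target(float_labels)
int_kind = type_of_target(int_labels)

print(float_kind)  # continuous
print(int_kind)    # binary
```

A "continuous" result suggests the target is a regression target, which a classifier will refuse.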

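Think of "classification" as the task of finding a discrete mapping from inputs to outputs, and "regression" as the task of finding a continuous mapping from inputs to outputs. Since your labels are floats, it looks to me like you are trying to do regression.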
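To make the distinction concrete: if classification really were the goal, the float targets would first have to be mapped to discrete classes. The sketch below assumes a purely hypothetical rule (positive value means class 1, otherwise class 0); the helper name, the threshold, and the sample values are all illustrative and do not come from the question.

```python
def to_direction_label(value, threshold=0.0):
    """Map a continuous value to a discrete class: 1 if above threshold, else 0.

    Hypothetical rule chosen only for illustration.
    """
    return 1 if value > threshold else 0


# Made-up continuous targets, in the same style as training_y.
continuous_targets = [0.3, 0.2, -0.1, 0.0]

# Discrete labels that a classifier could accept.
class_labels = [to_direction_label(v) for v in continuous_targets]

print(class_labels)  # [1, 1, 0, 0]
```

Whether such a binning makes sense depends entirely on what the model is supposed to predict; if the exact float value matters, regression is the better fit.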

If so, consider using MLPRegressor instead.
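A minimal sketch of that swap, reusing the hyperparameters from the question and the two sample rows shown there. The max_iter value is an assumption added for this tiny example, and with only two samples the fit is not meaningful (scikit-learn may also emit a convergence warning); the point is only that fit accepts float targets.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Sample data copied from the question: 2 samples, 3 features each.
training_x = np.array([[0.1, 0.2, -0.1], [0.21, -0.32, 0.3]])
training_y = np.array([0.3, 0.2])  # continuous float targets

# Same settings as the MLPClassifier in the question; max_iter is an assumed value.
reg = MLPRegressor(solver='adam', alpha=1e-5, hidden_layer_sizes=(5, 3),
                   random_state=1, max_iter=500)
reg.fit(training_x, training_y)  # float targets are accepted by a regressor

# One float prediction per input row.
predictions = reg.predict(training_x)
print(predictions.shape)
```

In the asker's code, the equivalent change would be replacing MLPClassifier with MLPRegressor and leaving the training_x and training_y lists as they are.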

Regarding "How do I format data in Python so that scikit-learn lets me call .fit(X, y)?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43579320/
