
python - Scikit learn + Pandas ValueError: shapes (1,1) and (10,10) not aligned

Reposted. Author: 行者123. Updated: 2023-11-30 09:45:17

I'm having a problem using SciKit Learn.

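I'm working on a very simple linear regression problem. Given input values for hours studied and final grade, I want to be able to estimate a student's grade from how long they studied.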

In [1]: import pandas as pd
In [2]: path = 'Desktop/hoursgrades.csv'
In [3]: df = pd.read_csv(path)
In [4]: X = df['Hours Studied']
In [5]: y = df['Grade']
In [6]: training_data_in = list()
In [7]: training_data_out = list()
In [8]: training_data_in.append(X)
In [9]: training_data_out.append(y)
In [11]: from sklearn.linear_model import LinearRegression
In [12]: model = LinearRegression(n_jobs =-1)
In [13]: model.fit(X = training_data_in, y = training_data_out)
Out[13]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=-1, normalize=False)

In this example, df looks like this:

In [16]: df
Out[16]:
   Hours Studied  Grade
0              1   10.0
1              2   20.0
2              3   30.0
3              4   40.0
4              5   50.0
5              6   60.0
6              7   70.0
7              8   80.0
8              9   90.0
9             10  100.0

X looks like this:

In [17]: X
Out[17]:
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
Name: Hours Studied, dtype: int64

y looks like this:

In [18]: y
Out[18]:
0 10.0
1 20.0
2 30.0
3 40.0
4 50.0
5 60.0
6 70.0
7 80.0
8 90.0
9 100.0
Name: Grade, dtype: float64

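So far so good: the model seems to have accepted everything I have given it. Now I want to test it with some input data. For example, I want to say that a student studied for 5 hours and have the model tell me the expected grade.

But when I pass that to the model, I get the following error.

Can anyone offer some advice?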

In [14]: studied_hour = [[5]]

In [15]: outcome = model.predict(X = studied_hour)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-15-6fdab4ae2efd> in <module>()
----> 1 outcome = model.predict(X = studied_hour)

~/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/base.py in predict(self, X)
254 Returns predicted values.
255 """
--> 256 return self._decision_function(X)
257
258 _preprocess_data = staticmethod(_preprocess_data)

~/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/base.py in _decision_function(self, X)
239 X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
240 return safe_sparse_dot(X, self.coef_.T,
--> 241 dense_output=True) + self.intercept_
242
243 def predict(self, X):

~/anaconda3/lib/python3.7/site-packages/sklearn/utils/extmath.py in safe_sparse_dot(a, b, dense_output)
138 return ret
139 else:
--> 140 return np.dot(a, b)
141
142

ValueError: shapes (1,1) and (10,10) not aligned: 1 (dim 1) != 10 (dim 0)

I should add:

In [39]: X.shape
Out[39]: (10,)

In [40]: y.shape
Out[40]: (10,)

Best answer

The input shapes are wrong for both X and y: per the docs, X must be (n_samples, n_features) and y must be (n_samples,).

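You see the error because wrapping each Series in a list makes the model think you have a single sample with ten features and ten different outputs. Its coefficient matrix therefore has shape (10, 10), which cannot be multiplied with the (1, 1) input [[5]].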
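To see where the (10, 10) comes from, the sketch below rebuilds the example data inline (an assumption, since the original CSV is not available) and wraps each column in a one-element list the way the question does. The names X_wrapped and y_wrapped are illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumed stand-in for hoursgrades.csv, matching the df shown in the question.
df = pd.DataFrame({
    "Hours Studied": range(1, 11),
    "Grade": [10.0 * h for h in range(1, 11)],
})

# Putting a whole column inside a one-element list yields ONE row of ten
# values, which mirrors training_data_in.append(X) in the question.
X_wrapped = np.array([df["Hours Studied"].to_numpy()])
y_wrapped = np.array([df["Grade"].to_numpy()])

x_shape = X_wrapped.shape   # (1, 10): 1 sample, 10 features
y_shape = y_wrapped.shape   # (1, 10): 1 sample, 10 targets

model = LinearRegression().fit(X_wrapped, y_wrapped)
coef_shape = model.coef_.shape   # (10, 10): n_targets x n_features

# A single-feature input no longer matches the ten features seen in fit.
try:
    model.predict([[5]])
    raised = False
except ValueError:
    raised = True
```

The call to predict raises a ValueError here as well, although the exact message depends on the scikit-learn version.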

The following gives the correct result:

X = df[['Hours Studied']]  # note the double brackets, shape (10, 1)
y = df['Grade']
model = LinearRegression().fit(X, y)

model.predict([[5]])
array([50.])
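If the feature is already a 1-D Series or array, another common way to get the (n_samples, n_features) shape is reshape(-1, 1). This is a minimal sketch, again with the data rebuilt inline as an assumption rather than read from the original CSV:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumed stand-in for hoursgrades.csv, matching the df shown in the question.
df = pd.DataFrame({
    "Hours Studied": range(1, 11),
    "Grade": [10.0 * h for h in range(1, 11)],
})

# reshape(-1, 1) turns the 1-D column of shape (10,) into a 2-D array
# of shape (10, 1): ten samples, one feature.
X = df["Hours Studied"].to_numpy().reshape(-1, 1)
y = df["Grade"].to_numpy()   # targets stay 1-D, shape (10,)

model = LinearRegression().fit(X, y)

# Inputs to predict need the same 2-D layout: one row per student.
pred = model.predict(np.array([[5], [7.5]]))
```

Because the sample data is exactly linear, the two predictions should come out at about 50 and 75. Fitting on NumPy arrays also sidesteps the feature-name warning that recent scikit-learn versions may emit when a model fitted on a DataFrame is asked to predict on a plain list.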

This question, python - Scikit learn + Pandas ValueError: shapes (1,1) and (10,10) not aligned, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/53355338/
