
python - Bug in sklearn.cross_validation

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:52:32

There may be a bug in LeaveOneOut from sklearn.cross_validation. x_test and y_test are not used in LeaveOneOut. Instead, validation is done using x_train and y_train.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cross_validation import LeaveOneOut, cross_val_predict

x = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
y = np.array([12,13,19,18,15])
clf = LinearRegression().fit(x,y)
cv = LeaveOneOut(len(y))
for train, test in cv:
    x_train, y_train = x[train], y[train]
    x_test, y_test = x[test], y[test]
    y_pred_USING_x_test = clf.predict(x_test)
    y_pred_USING_x_train = clf.predict(x_train)
    print 'y_pred_USING_x_test: ', y_pred_USING_x_test, 'y_pred_USING_x_train: ', y_pred_USING_x_train



y_pred_USING_x_test: [ 13.2] y_pred_USING_x_train: [ 14.3 15.4 16.5 17.6]
y_pred_USING_x_test: [ 14.3] y_pred_USING_x_train: [ 13.2 15.4 16.5 17.6]
y_pred_USING_x_test: [ 15.4] y_pred_USING_x_train: [ 13.2 14.3 16.5 17.6]
y_pred_USING_x_test: [ 16.5] y_pred_USING_x_train: [ 13.2 14.3 15.4 17.6]
y_pred_USING_x_test: [ 17.6] y_pred_USING_x_train: [ 13.2 14.3 15.4 16.5]

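y_pred_USING_x_test gives a single value in each iteration of the for loop, which makes no sense!

y_pred_USING_x_train is what I was looking for with LeaveOneOut.

The result of the following code is completely irrelevant!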

bug = cross_val_predict(clf, x, y, cv=cv)
print 'bug: ', bug
bug: [ 15. 14.85714286 14.5 15.85714286 21.5 ]

Any defense is welcome.

Best answer

According to the LeaveOneOut documentation:

Each sample is used once as a test set (singleton)

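This means x_test will be an array containing one element, and clf.predict(x_test) will return an array containing one (predicted) element. This is what you see in your output.

x_train will be the training set without the one element that was selected for x_test. This can be confirmed by adding the following lines to the for loop: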

for train, test in cv:
    x_train, y_train = x[train], y[train]
    x_test, y_test = x[test], y[test]
    if len(x_test)!=1 or ( len(x_train)+1!=len(x) ): # Confirmation
        raise Exception
    y_pred_USING_x_test = clf.predict(x_test)
    y_pred_USING_x_train = clf.predict(x_train)
    print 'predicting for',x_test,'and expecting',y_test, 'and got', y_pred_USING_x_test
    print 'predicting for',x_train,'and expecting',y_train, 'and got', y_pred_USING_x_train
    print
    print
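The split sizes described above can also be inspected directly. The sketch below assumes a newer scikit-learn release, where sklearn.cross_validation has been replaced by sklearn.model_selection and LeaveOneOut takes no length argument; the variable names loo and splits are illustrative only.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut  # assumption: newer scikit-learn

x = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])

loo = LeaveOneOut()
# split() yields one (train indices, test indices) pair per sample
splits = list(loo.split(x))
for train, test in splits:
    # every test set holds exactly one index; the rest go to training
    print("train indices:", train, "test index:", test)
```

Each of the five iterations should leave out a different single sample, which is why predicting on x_test returns one value per iteration.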




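Note: this is not a proper validation, because you are training and testing your model on the same data. You should create a new LinearRegression object in each iteration of the for loop and train it with x_train and y_train. Use it to predict x_test, then compare y_test with y_pred_USING_x_test.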

x = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
y = np.array([12,13,19,18,15])
cv = LeaveOneOut(len(y))
for train, test in cv:
    x_train, y_train = x[train], y[train]
    x_test, y_test = x[test], y[test]
    if len(x_test)!=1 or ( len(x_train)+1!=len(x) ):
        raise Exception
    clf = LinearRegression()
    clf.fit(x_train, y_train)
    y_pred_USING_x_test = clf.predict(x_test)
    print 'predicting for',x_test,'and expecting',y_test, 'and got', y_pred_USING_x_test
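As a rough cross-check, the refit-per-fold loop can be compared against cross_val_predict. This is a minimal sketch, assuming a newer scikit-learn release that provides sklearn.model_selection instead of sklearn.cross_validation; the names manual and automatic are illustrative and do not appear in the original post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict  # assumption: newer scikit-learn

x = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([12, 13, 19, 18, 15])

loo = LeaveOneOut()
manual = np.empty(len(y), dtype=float)
for train, test in loo.split(x):
    # fit a fresh model on the four training samples only
    model = LinearRegression().fit(x[train], y[train])
    # predict the single held-out sample and store it at its own index
    manual[test] = model.predict(x[test])

# cross_val_predict clones and refits the estimator for every fold
automatic = cross_val_predict(LinearRegression(), x, y, cv=loo)

print("manual loop:      ", manual)
print("cross_val_predict:", automatic)
```

Under these assumptions both arrays should agree, and they should match the values printed as "bug" in the question. That suggests cross_val_predict was already performing the refit-per-fold validation recommended here, rather than returning irrelevant numbers.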

Regarding python - Bug in sklearn.cross_validation, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33655121/
