
python - Different scores when using a scikit-learn pipeline vs. doing it manually

Reposted. Author: 行者123. Updated: 2023-12-01 07:30:23

Below is a simple example that uses MinMaxScaler, PolynomialFeatures, and a LinearRegression model.

Using a pipeline:

pipeLine = make_pipeline(MinMaxScaler(),PolynomialFeatures(), LinearRegression())

pipeLine.fit(X_train,Y_train)
print(pipeLine.score(X_test,Y_test))
print(pipeLine.steps[2][1].intercept_)
print(pipeLine.steps[2][1].coef_)

0.4433729905419167
3.4067909278765605
[ 0. -7.60868833 5.87162697]

Doing it manually:

X_trainScaled = MinMaxScaler().fit_transform(X_train)
X_trainScaledandPoly = PolynomialFeatures().fit_transform(X_trainScaled)

X_testScaled = MinMaxScaler().fit_transform(X_test)
X_testScaledandPoly = PolynomialFeatures().fit_transform(X_testScaled)

reg = LinearRegression()
reg.fit(X_trainScaledandPoly,Y_train)
print(reg.score(X_testScaledandPoly,Y_test))
print(reg.intercept_)
print(reg.coef_)
print(reg.intercept_ == pipeLine.steps[2][1].intercept_)
print(reg.coef_ == pipeLine.steps[2][1].coef_)

0.44099256691782807
3.4067909278765605
[ 0. -7.60868833 5.87162697]
True
[ True True True]

Best answer

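The problem is in your manual steps: you refit the scaler on the test data. The scaler must be fitted on the training data only, and that same fitted instance must then be used to transform the test data. Refitting on the test set gives the scaler a different per-feature minimum and maximum, so the test features no longer match the scale the model was trained on. That is why the coefficients and intercept are identical but the scores differ. For details, see: "How to normalize the Train and Test data using MinMaxScaler sklearn" and "StandardScaler before and after splitting data".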

from sklearn.datasets import make_regression
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

X, y = make_regression(n_features=3, n_samples=50, n_informative=1, noise=1)
X_train, X_test, Y_train, Y_test = train_test_split(X, y)

pipeLine = make_pipeline(MinMaxScaler(),PolynomialFeatures(), LinearRegression())

pipeLine.fit(X_train,Y_train)
print(pipeLine.score(X_test,Y_test))
print(pipeLine.steps[2][1].intercept_)
print(pipeLine.steps[2][1].coef_)

scaler = MinMaxScaler().fit(X_train)
X_trainScaled = scaler.transform(X_train)
X_trainScaledandPoly = PolynomialFeatures().fit_transform(X_trainScaled)


X_testScaled = scaler.transform(X_test)
X_testScaledandPoly = PolynomialFeatures().fit_transform(X_testScaled)

reg = LinearRegression()
reg.fit(X_trainScaledandPoly,Y_train)
print(reg.score(X_testScaledandPoly,Y_test))
print(reg.intercept_)
print(reg.coef_)
print(reg.intercept_ == pipeLine.steps[2][1].intercept_)
print(reg.coef_ == pipeLine.steps[2][1].coef_)
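
To make the difference visible in a single run, here is a minimal sketch that compares three scores: the pipeline, the manual route with the scaler fitted on the training data, and the manual route with a second scaler fitted on the test data. The fixed `random_state` values and the names `pipe`, `test_scaler`, `good_score` and `bad_score` are assumptions added for illustration; they are not part of the original question or answer.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

# Fixed seeds (an assumption, not in the original) make the run repeatable.
X, y = make_regression(n_features=3, n_samples=50, n_informative=1,
                       noise=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(MinMaxScaler(), PolynomialFeatures(), LinearRegression())
pipe.fit(X_train, y_train)
pipe_score = pipe.score(X_test, y_test)

# Correct manual route: every transformer is fitted on the training data only.
scaler = MinMaxScaler().fit(X_train)
poly = PolynomialFeatures().fit(scaler.transform(X_train))
reg = LinearRegression().fit(poly.transform(scaler.transform(X_train)), y_train)
good_score = reg.score(poly.transform(scaler.transform(X_test)), y_test)

# Mistaken route from the question: a second scaler is fitted on the test data.
test_scaler = MinMaxScaler().fit(X_test)
bad_score = reg.score(poly.transform(test_scaler.transform(X_test)), y_test)

# Check whether the two scalers learned the same per-feature minima and maxima.
same_ranges = (np.allclose(scaler.data_min_, test_scaler.data_min_)
               and np.allclose(scaler.data_max_, test_scaler.data_max_))

print("pipeline score:    ", pipe_score)
print("manual, train-fit: ", good_score)
print("manual, test-refit:", bad_score)
print("scores match:      ", np.isclose(pipe_score, good_score))
print("scalers identical: ", same_ranges)
```

The pipeline and the train-fitted manual route should agree, because `pipe.score` only calls `transform` on the test data and never refits. Comparing with `np.isclose` rather than `==` is a deliberate choice here, since exact equality on floating-point results can fail for reasons unrelated to the workflow.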

Regarding "python - Different scores when using a scikit-learn pipeline vs. doing it manually", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57229489/
