
python - How to get predictions from XGBoost to match those from the Scikit-Learn wrapper for XGBoost?


I'm new to XGBoost in Python, so apologies if the answer here is obvious, but I'm trying to take a pandas DataFrame and get XGBoost in Python to give me the same predictions I get when using the Scikit-Learn wrapper for the same exercise. So far I've been unable to do so. To give an example, here I take the Boston dataset, convert it to a pandas DataFrame, train on the first 500 observations, and then predict the last 6. I do it with XGBoost first, then with the Scikit-Learn wrapper, and I get different predictions even though I set the model parameters to be the same. Specifically, the array predictions looks very different from the array predictions2 (see the code below). Any help would be much appreciated!

from sklearn import datasets
import pandas as pd
import xgboost as xgb
from xgboost.sklearn import XGBClassifier
from xgboost.sklearn import XGBRegressor

### Use the boston data as an example, train on first 500, predict last 6
# Note: load_boston was removed in scikit-learn 1.2, so this snippet needs an older version
boston_data = datasets.load_boston()
df_boston = pd.DataFrame(boston_data.data,columns=boston_data.feature_names)
df_boston['target'] = pd.Series(boston_data.target)


#### Code using XGBoost
Sub_train = df_boston.head(500)
target = Sub_train["target"]
Sub_train = Sub_train.drop('target', axis=1)

Sub_predict = df_boston.tail(6)
Sub_predict = Sub_predict.drop('target', axis=1)

# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() is its replacement
xgtrain = xgb.DMatrix(Sub_train.to_numpy(), label=target.tolist())
xgtest = xgb.DMatrix(Sub_predict.to_numpy())

params = {'booster': 'gblinear', 'objective': 'reg:linear',
          'max_depth': 2, 'learning_rate': .1, 'n_estimators': 500,
          'min_child_weight': 3, 'colsample_bytree': .7,
          'subsample': .8, 'gamma': 0, 'reg_alpha': 1}

model = xgb.train(dtrain=xgtrain, params=params)

predictions = model.predict(xgtest)

#### Code using the Scikit-Learn wrapper for XGBoost
model = XGBRegressor(learning_rate=.1, n_estimators=500,
                     max_depth=2, min_child_weight=3, gamma=0,
                     subsample=.8, colsample_bytree=.7, reg_alpha=1,
                     objective='reg:linear')

target = "target"

Sub_train = df_boston.head(500)
Sub_predict = df_boston.tail(6)
Sub_predict = Sub_predict.drop('target', axis=1)

Ex_List = ['target']

predictors = [i for i in Sub_train.columns if i not in Ex_List]

model = model.fit(Sub_train[predictors],Sub_train[target])

predictions2 = model.predict(Sub_predict)
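For reference, here is a minimal way to see the mismatch concretely (a sketch assuming the predictions and predictions2 arrays produced by the two blocks above):

import numpy as np

# Show the two prediction arrays side by side; with the parameters above
# they disagree by far more than floating-point noise.
print(np.column_stack([predictions, predictions2]))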

Best Answer

Please see this answer here:

xgboost.train will ignore the parameter n_estimators, while xgboost.XGBRegressor accepts it. In xgboost.train, the number of boosting iterations (i.e. n_estimators) is controlled by num_boost_round (default: 10).

It suggests removing n_estimators from the params supplied to xgb.train and replacing it with num_boost_round. Note that the revised params below also drop 'booster': 'gblinear' from the question's original dict, so both APIs use the default gbtree booster, and they spell the L1 regularization term as alpha, the native parameter name for which reg_alpha is an alias.

So change your params like this:

params = {'objective': 'reg:linear',
          'max_depth': 2, 'learning_rate': .1,
          'min_child_weight': 3, 'colsample_bytree': .7,
          'subsample': .8, 'gamma': 0, 'alpha': 1}

Then train with xgb.train like this:

model = xgb.train(dtrain=xgtrain, params=params, num_boost_round=500)

You will get the same results.
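As a quick sanity check (a minimal sketch, relying on the fact that with the gbtree booster each boosting round adds one tree to the model dump), you can confirm how many rounds actually ran:

# Booster.get_dump() returns one text dump per boosted tree, i.e. one
# entry per boosting round for this regression.
print(len(model.get_dump()))  # 500 here; 10 if num_boost_round is omitted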

Alternatively, keep xgb.train as it is and change the XGBRegressor like this:

model = XGBRegressor(learning_rate=.1, n_estimators=10,
                     max_depth=2, min_child_weight=3, gamma=0,
                     subsample=.8, colsample_bytree=.7, reg_alpha=1,
                     objective='reg:linear')

Then you will also get the same results.
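With either fix applied, a quick check (a sketch that assumes predictions and predictions2 have been recomputed with the matching parameters) confirms the two APIs now agree:

import numpy as np

# The two arrays should now match up to floating-point tolerance.
print(np.allclose(predictions, predictions2, atol=1e-5))  # expect True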

Regarding python - How to get predictions from XGBoost to match those from the Scikit-Learn wrapper for XGBoost, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46943674/
