
python - Cross-validation for XGBClassifier for multi-class classification in python

Reposted. Author: 太空狗. Updated: 2023-10-30 01:20:39

I am trying to perform cross-validation on an XGBClassifier for a multi-class classification problem, using the following code adapted from http://www.analyticsvidhya.com/blog/2016/03/complete-guide-parameter-tuning-xgboost-with-codes-python/

import numpy as np
import pandas as pd
import xgboost as xgb
from xgboost.sklearn import XGBClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn import cross_validation, metrics
from sklearn.grid_search import GridSearchCV


def modelFit(alg, X, y, useTrainCV=True, cvFolds=5, early_stopping_rounds=50):
    if useTrainCV:
        xgbParams = alg.get_xgb_params()
        xgTrain = xgb.DMatrix(X, label=y)
        cvresult = xgb.cv(xgbParams,
                          xgTrain,
                          num_boost_round=alg.get_params()['n_estimators'],
                          nfold=cvFolds,
                          stratified=True,
                          metrics={'mlogloss'},
                          early_stopping_rounds=early_stopping_rounds,
                          seed=0,
                          callbacks=[xgb.callback.print_evaluation(show_stdv=False),
                                     xgb.callback.early_stop(3)])

        print cvresult
        alg.set_params(n_estimators=cvresult.shape[0])

    # Fit the algorithm
    alg.fit(X, y, eval_metric='mlogloss')

    # Predict
    dtrainPredictions = alg.predict(X)
    dtrainPredProb = alg.predict_proba(X)

    # Print model report:
    print "\nModel Report"
    print "Classification report: \n"
    print(classification_report(y_val, y_val_pred))
    print "Accuracy : %.4g" % metrics.accuracy_score(y, dtrainPredictions)
    print "Log Loss Score (Train): %f" % metrics.log_loss(y, dtrainPredProb)
    feat_imp = pd.Series(alg.booster().get_fscore()).sort_values(ascending=False)
    feat_imp.plot(kind='bar', title='Feature Importances')
    plt.ylabel('Feature Importance Score')


# 1) Read training set
print('>> Read training set')
train = pd.read_csv(trainFile)

# 2) Extract target attribute and convert to numeric
print('>> Preprocessing')
y_train = train['OutcomeType'].values
le_y = LabelEncoder()
y_train = le_y.fit_transform(y_train)
train.drop('OutcomeType', axis=1, inplace=True)

# 4) Extract features and target from training set
X_train = train.values

# 5) First classifier
xgb = XGBClassifier(learning_rate=0.1,
                    n_estimators=1000,
                    max_depth=5,
                    min_child_weight=1,
                    gamma=0,
                    subsample=0.8,
                    colsample_bytree=0.8,
                    scale_pos_weight=1,
                    objective='multi:softprob',
                    seed=27)

modelFit(xgb, X_train, y_train)

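where y_train contains labels from 0 to 4. However, when I run this code I get the following error from the xgb.cv function: xgboost.core.XGBoostError: value 0 for Parameter num_class should be greater equal to 1. In the XGBoost documentation I read that, in the multi-class case, xgb infers the number of classes from the labels in the target vector, so I do not understand what is happening.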

Best answer

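You have to add the parameter 'num_class' to the xgb_param dictionary. This is also mentioned in the parameter description and in a comment on the link you provided above.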
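A minimal sketch of that fix follows. It assumes the labels are contiguous integers starting at 0 (which is what LabelEncoder produces), so the number of distinct labels equals the number of classes. The helper names add_num_class and run_cv are hypothetical and are not part of XGBoost; only get_xgb_params, get_params, xgboost.DMatrix and xgboost.cv are library calls.

```python
def add_num_class(xgb_params, y):
    """Return a copy of the parameter dict with 'num_class' set.

    Assumes y holds integer labels 0..K-1 (e.g. LabelEncoder output),
    so the count of distinct labels is the number of classes K.
    """
    params = dict(xgb_params)          # copy, leave the caller's dict untouched
    params["num_class"] = len(set(y))  # xgb.cv does not infer this by itself
    return params


def run_cv(alg, X, y, cv_folds=5, early_stopping_rounds=50):
    """Hypothetical wrapper: cross-validate an XGBClassifier with xgb.cv."""
    import xgboost  # imported here so add_num_class works without xgboost

    params = add_num_class(alg.get_xgb_params(), y)
    dtrain = xgboost.DMatrix(X, label=y)
    return xgboost.cv(params,
                      dtrain,
                      num_boost_round=alg.get_params()["n_estimators"],
                      nfold=cv_folds,
                      stratified=True,
                      metrics="mlogloss",
                      early_stopping_rounds=early_stopping_rounds,
                      seed=0)
```

In the question's modelFit, the equivalent change would be to set xgbParams['num_class'] right after the call to alg.get_xgb_params() and before xgb.cv is invoked. The sklearn wrapper works out the class count when alg.fit is called, but the dictionary returned by get_xgb_params on an unfitted classifier does not carry it, which would explain the error.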

Regarding python - Cross-validation for XGBClassifier for multi-class classification in python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37845920/
