python - LightGBM error: ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

Reposted. Author: 行者123. Updated: 2023-12-04 10:03:46

I am trying to train LightGBM with a grid search, and I get the following error when I try to fit the model.

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

I have provided a validation dataset and an evaluation metric, so I am not sure why I still get this error. Here is my code.
train_data  = rtotal[rtotal['train_Y'] == 1]
test_data = rtotal[rtotal['train_Y'] == 0]

trainData, validData = train_test_split(train_data, test_size=0.007, random_state = 123)

#train data prep
X_train = trainData.iloc[:,2:71]
y_train = trainData.loc[:,['a_class']]

#validation data prep
X_valid = validData.iloc[:,2:71]
y_valid = validData.loc[:,['a_class']]

#X_test
X_test = test_data.iloc[:,2:71]

import lightgbm as lgb
from sklearn.model_selection import GridSearchCV

gridParams = {
'learning_rate': [0.005],
'n_estimators': [40],
'num_leaves': [16,32, 64],
'objective' : ['multiclass'],
'random_state' : [501],
'num_boost_round' : [3000],
'colsample_bytree' : [0.65, 0.66],
'subsample' : [0.7,0.75],
'reg_alpha' : [1,1.2],
'reg_lambda' : [1,1.2,1.4],
}

lgb_estimator = lgb.LGBMClassifier(boosting_type = 'gbdt',
n_estimators=500,
objective = 'multiclass',
learning_rate = 0.05, num_leaves = 64,
eval_metric = 'multi_logloss',
verbose_eval=20,
eval_set = [X_valid, y_valid],
early_stopping_rounds=100)

g_lgbm = GridSearchCV(estimator=lgb_estimator, param_grid=gridParams, n_jobs = 3, cv= 3)

lgb_model = g_lgbm.fit(X=X_train, y=y_train)

Best answer

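From the code you provided, I can see a few problems:

  • You set the objective to multiclass, but that may not match your data: your target is a single column, which I believe may contain multiple labels.
  • If you want early stopping, you need to provide a validation set, as the error message clearly states, and you need to do it in the proper way: pass it to fit(), not to the LGBMClassifier constructor.

  • With these errors corrected, your code runs happily: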
    gridParams = { 
    'learning_rate': [0.005],
    'n_estimators': [40],
    'num_leaves': [16,32, 64],
    'random_state' : [501],
    'num_boost_round' : [3000],
    'colsample_bytree' : [0.65, 0.66],
    'subsample' : [0.7,0.75],
    'reg_alpha' : [1,1.2],
    'reg_lambda' : [1,1.2,1.4],
    }

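    # Early-stopping settings stay on the estimator; `objective` is omitted
    # so that LightGBM picks it from the labels.
    lgb_estimator = lgb.LGBMClassifier(boosting_type = 'gbdt',
    n_estimators=500,
    learning_rate = 0.05, num_leaves = 64,
    eval_metric = 'logloss',
    verbose_eval=20,
    early_stopping_rounds=10)

    g_lgbm = GridSearchCV(estimator=lgb_estimator, param_grid=gridParams, n_jobs = 3, cv= 3)

    # eval_set is passed to fit(); GridSearchCV forwards it to every LGBMClassifier.fit() call.
    lgb_model = g_lgbm.fit(X=X_train, y=y_train, eval_set = (X_valid, y_valid))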

    ...
    [370] valid_0's binary_logloss: 0.422895
    [371] valid_0's binary_logloss: 0.423064
    [372] valid_0's binary_logloss: 0.422681
    [373] valid_0's binary_logloss: 0.423206
    [374] valid_0's binary_logloss: 0.423142
    [375] valid_0's binary_logloss: 0.423414
    [376] valid_0's binary_logloss: 0.423338
    [377] valid_0's binary_logloss: 0.423864
    [378] valid_0's binary_logloss: 0.42381
    [379] valid_0's binary_logloss: 0.42409
    [380] valid_0's binary_logloss: 0.423476
    [381] valid_0's binary_logloss: 0.423759
    [382] valid_0's binary_logloss: 0.423804
    Early stopping, best iteration is:
    [372] valid_0's binary_logloss: 0.422681

Regarding python - LightGBM error: ValueError: For early stopping, at least one dataset and eval metric is required for evaluation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61694081/
