
python-2.7 - Early stopping in Sklearn's GradientBoostingRegressor

Reposted · Author: 行者123 · Updated: 2023-11-30 09:08:26

I am using the Monitor class implemented here:

# Note: predict_stage lives in a private Cython module; the exact import
# path can vary across scikit-learn versions.
from sklearn.ensemble._gradient_boosting import predict_stage


class Monitor():
    """Monitor for early stopping in Gradient Boosting for classification.

    The monitor checks the validation loss between each training stage. When
    too many successive stages have increased the loss, the monitor will
    return true, stopping the training early.

    Parameters
    ----------
    X_valid : array-like, shape = [n_samples, n_features]
        Training vectors, where n_samples is the number of samples
        and n_features is the number of features.
    y_valid : array-like, shape = [n_samples]
        Target values (integers in classification, real numbers in
        regression).
        For classification, labels must correspond to classes.
    max_consecutive_decreases : int, optional (default=5)
        Early stopping criterion: when the number of consecutive iterations
        that result in a worse performance on the validation set exceeds
        this value, the training stops.
    """

    def __init__(self, X_valid, y_valid, max_consecutive_decreases=5):
        self.X_valid = X_valid
        self.y_valid = y_valid
        self.max_consecutive_decreases = max_consecutive_decreases
        self.losses = []

    def __call__(self, i, clf, args):
        if i == 0:
            self.consecutive_decreases_ = 0
            self.predictions = clf._init_decision_function(self.X_valid)

        # Add stage i's contribution to the running validation predictions.
        predict_stage(clf.estimators_, i, self.X_valid, clf.learning_rate,
                      self.predictions)
        self.losses.append(clf.loss_(self.y_valid, self.predictions))

        if len(self.losses) >= 2 and self.losses[-1] > self.losses[-2]:
            self.consecutive_decreases_ += 1
        else:
            self.consecutive_decreases_ = 0

        if self.consecutive_decreases_ >= self.max_consecutive_decreases:
            print("Validation loss increased for {} consecutive stages: "
                  "stopping early at iteration {}.".format(
                      self.consecutive_decreases_, i))
            return True
        else:
            return False

params = {'n_estimators': nEstimators,
          'max_depth': maxDepth,
          'min_samples_split': minSamplesSplit,
          'min_samples_leaf': minSamplesLeaf,
          'min_weight_fraction_leaf': minWeightFractionLeaf,
          'min_impurity_decrease': minImpurityDecrease,
          'learning_rate': 0.01,
          'loss': 'quantile',
          'alpha': alpha,
          'verbose': 0}
model = ensemble.GradientBoostingRegressor(**params)
model.fit(XTrain, yTrain, monitor=Monitor(XTest, yTest, 25))
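As an aside (the question targets Python 2.7, where this is not available): scikit-learn 0.20 and later ship built-in early stopping for gradient boosting via the `n_iter_no_change` and `validation_fraction` parameters, which removes the need for a custom monitor. A minimal sketch on synthetic data (the data and parameter values are illustrative, not from the question):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# n_iter_no_change: stop when the validation loss has not improved (by at
# least tol) for that many consecutive stages; validation_fraction carves
# the validation set out of the training data automatically.
model = GradientBoostingRegressor(n_estimators=1000,
                                  learning_rate=0.01,
                                  loss='quantile',
                                  alpha=0.5,
                                  n_iter_no_change=25,
                                  validation_fraction=0.2,
                                  random_state=0)
model.fit(X, y)

# n_estimators_ is the number of stages actually fitted: at most the
# requested 1000, fewer if early stopping was triggered.
print(model.n_estimators_)
```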

This works very well. However, it is not clear to me which model this line

model.fit(XTrain, yTrain, monitor=Monitor(XTest, yTest, 25))

returns:

1) no model

2) the model as trained when stopping occurred

3) the model from 25 iterations earlier (note the monitor's argument)

If it is not (3), is it possible to make the estimator return (3)?

How can I do that?

It is worth mentioning that the xgboost library does this; however, it does not allow me to use the loss function that I need.

Best Answer

The fit method returns the model as trained up to the point where the "stopping rule" halted it, meaning your answer (2) is correct.

The problem with this monitoring code is that the model you end up with includes the 25 extra (worsening) iterations. The model that should be selected is your answer (3).
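One way to obtain model (3) without retraining is to truncate the fitted ensemble after the fact. This relies on scikit-learn internals (the `estimators_` array of fitted trees), so treat it as a sketch rather than a supported API; the data and stage counts below are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=300, n_features=8, noise=5.0, random_state=0)

# Fit once with "too many" stages, then cut the ensemble back to 30.
model = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)
model.estimators_ = model.estimators_[:30]    # keep only the first 30 stages
model.train_score_ = model.train_score_[:30]  # keep bookkeeping consistent

# Boosting is sequential, so the truncated model predicts exactly like one
# trained with n_estimators=30 from scratch (same data, same random_state).
reference = GradientBoostingRegressor(n_estimators=30, random_state=0).fit(X, y)
print(np.allclose(model.predict(X), reference.predict(X)))  # True
```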

I think a simple (and naive) way is to run the same model again (using a seed, so the results are identical), but cap the number of iterations at (i - max_consecutive_decreases).
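That suggestion can also be sketched with only the public scikit-learn API: evaluate every stage on the validation set with `staged_predict`, pick the best iteration, and refit with exactly that many estimators. The data and parameter values below are illustrative, not from the question:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=20.0, random_state=0)
XTrain, XTest, yTrain, yTest = train_test_split(X, y, random_state=0)

params = dict(n_estimators=200, learning_rate=0.05,
              loss='quantile', alpha=0.5, random_state=0)
model = GradientBoostingRegressor(**params).fit(XTrain, yTrain)

# staged_predict yields the prediction after each boosting stage; for
# alpha=0.5 the quantile (pinball) loss is proportional to the MAE, so
# the argmin is the same.
val_losses = [mean_absolute_error(yTest, pred)
              for pred in model.staged_predict(XTest)]
best_iter = int(np.argmin(val_losses)) + 1  # stages are 1-indexed

# Refit with the same seed but only best_iter stages.
params['n_estimators'] = best_iter
best_model = GradientBoostingRegressor(**params).fit(XTrain, yTrain)
```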

Regarding early stopping in Sklearn's GradientBoostingRegressor, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46281012/
