r - Why does training an XGBoost model with pseudo-Huber loss return a constant test metric?

Reposted. Author: 行者123. Updated: 2023-12-04 17:20:05

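I am trying to fit an xgboost model using the native pseudo-Huber loss, reg:pseudohubererror. However, it does not seem to work, because neither the training error nor the test error improves. It works fine with reg:squarederror. What am I missing?
Code: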

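library(xgboost)

# Simulate 1000 rows: two uniform features and a linear response with Gaussian noise
n = 1000
X = cbind(runif(n, 10, 20), runif(n, 0, 10))
y = X %*% c(2, 3) + rnorm(n, 0, 1)

# Use the first n - 1 rows for training and hold out the last row as the test set
train = xgb.DMatrix(data = X[-n, ],
                    label = y[-n])

test = xgb.DMatrix(data = t(as.matrix(X[n, ])),
                   label = y[n])

watchlist = list(train = train, test = test)

# Fit with the built-in pseudo-Huber objective, tracking MAE on both sets
xbg_test = xgb.train(data = train,
                     objective = "reg:pseudohubererror",
                     eval_metric = "mae",
                     watchlist = watchlist,
                     gamma = 1, eta = 0.01,
                     nrounds = 10000,
                     early_stopping_rounds = 100)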
Result:
[1] train-mae:44.372692 test-mae:33.085709 
Multiple eval metrics are present. Will use test_mae for early stopping.
Will train until test_mae hasn't improved in 100 rounds.

[2] train-mae:44.372692 test-mae:33.085709
[3] train-mae:44.372688 test-mae:33.085709
[4] train-mae:44.372688 test-mae:33.085709
[5] train-mae:44.372688 test-mae:33.085709
[6] train-mae:44.372688 test-mae:33.085709
[7] train-mae:44.372688 test-mae:33.085709
[8] train-mae:44.372688 test-mae:33.085709
[9] train-mae:44.372688 test-mae:33.085709
[10] train-mae:44.372692 test-mae:33.085709

Best answer

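This appears to be the expected behavior of the pseudo-Huber loss. Here I hard-coded the first and second derivatives of the objective's loss function, which I found here, and fed them in through the obj=obje parameter. If you run it and compare it with the objective="reg:pseudohubererror" version, you will see that they behave the same. As for why it is so much worse than squared loss, I am not sure.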

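set.seed(20)

# Custom objective: returns a hard-coded gradient and Hessian for each row
obje = function(pred, dData) {
  labels = getinfo(dData, "label")
  a = pred    # current predictions
  d = labels  # true labels
  # Hard-coded first derivative, returned as the gradient
  fir = a^2 / sqrt(a^2 / d^2 + 1) / d - 2 * d * (sqrt(a^2 / d^2 + 1) - 1)
  # Hard-coded second derivative, returned as the Hessian
  sec = ((2 * (a^2 / d^2 + 1)^(3/2) - 2) * d^2 - 3 * a^2) / ((a^2 / d^2 + 1)^(3/2) * d^2)
  return(list(grad = fir, hess = sec))
}

# Same call as in the question, but passing the custom objective through obj
xbg_test = xgb.train(data = train,
                     obj = obje,
                     eval_metric = "mae",
                     watchlist = watchlist,
                     gamma = 1, eta = 0.01,
                     nrounds = 10000,
                     early_stopping_rounds = 100)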
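
For reference, here is a sketch of the textbook pseudo-Huber loss and its first two derivatives, written in terms of the residual r = prediction - label with slope δ. The symbols r, δ, g and h are notation introduced here, and taking δ = 1 as what reg:pseudohubererror uses is an assumption, not something the post confirms. Note that the code above plugs pred and labels into its formulas separately rather than through their difference.

```latex
\begin{aligned}
L_\delta(r) &= \delta^2\left(\sqrt{1 + \frac{r^2}{\delta^2}} - 1\right), \qquad r = \hat{y} - y \\
g(r) = \frac{\partial L_\delta}{\partial \hat{y}} &= \frac{r}{\sqrt{1 + r^2/\delta^2}} \\
h(r) = \frac{\partial^2 L_\delta}{\partial \hat{y}^2} &= \frac{1}{\left(1 + r^2/\delta^2\right)^{3/2}}
\end{aligned}
```

With δ = 1 and residuals of about 44 in magnitude, as the first logged train-mae implies, these give |g| close to 1 and h on the order of 1e-5 per row, so the Hessian summed over the 999 training rows is only about 0.01. One possible reading, offered as a hypothesis rather than a confirmed diagnosis, is that this sum falls below XGBoost's default min_child_weight of 1 (a minimum on the summed Hessian in a node), which would keep the trees from making any update. Squared error has h = 1 per row and would not run into this.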

Regarding "r - Why does training an XGBoost model with pseudo-Huber loss return a constant test metric?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66696885/
