gpt4 book ai didi

R gbm 逻辑回归

转载 作者:行者123 更新时间:2023-12-04 05:41:51 29 4
gpt4 key购买 nike

我希望使用 GBM包进行逻辑回归,但它给出的答案略超出 0-1 范围。我已经尝试了 0-1 预测的建议分布参数( bernoulliadaboost ),但这实际上比使用 gaussian 更糟。 .

GBM_NTREES = 150
GBM_SHRINKAGE = 0.1
GBM_DEPTH = 4
GBM_MINOBS = 50
> GBM_model <- gbm.fit(
+ x = trainDescr
+ ,y = trainClass
+ ,distribution = "gaussian"
+ ,n.trees = GBM_NTREES
+ ,shrinkage = GBM_SHRINKAGE
+ ,interaction.depth = GBM_DEPTH
+ ,n.minobsinnode = GBM_MINOBS
+ ,verbose = TRUE)
Iter TrainDeviance ValidDeviance StepSize Improve
1 0.0603 nan 0.1000 0.0019
2 0.0588 nan 0.1000 0.0016
3 0.0575 nan 0.1000 0.0013
4 0.0563 nan 0.1000 0.0011
5 0.0553 nan 0.1000 0.0010
6 0.0546 nan 0.1000 0.0008
7 0.0539 nan 0.1000 0.0007
8 0.0533 nan 0.1000 0.0006
9 0.0528 nan 0.1000 0.0005
10 0.0524 nan 0.1000 0.0004
100 0.0484 nan 0.1000 0.0000
150 0.0481 nan 0.1000 -0.0000
> prediction <- predict.gbm(object = GBM_model
+ ,newdata = testDescr
+ ,GBM_NTREES)
> hist(prediction)
> range(prediction)
[1] -0.02945224 1.00706700

伯努利:
GBM_model <- gbm.fit(
x = trainDescr
,y = trainClass
,distribution = "bernoulli"
,n.trees = GBM_NTREES
,shrinkage = GBM_SHRINKAGE
,interaction.depth = GBM_DEPTH
,n.minobsinnode = GBM_MINOBS
,verbose = TRUE)
prediction <- predict.gbm(object = GBM_model
+ ,newdata = testDescr
+ ,GBM_NTREES)
> hist(prediction)
> range(prediction)
[1] -4.699324 3.043440

和 adaboost:
GBM_model <- gbm.fit(
x = trainDescr
,y = trainClass
,distribution = "adaboost"
,n.trees = GBM_NTREES
,shrinkage = GBM_SHRINKAGE
,interaction.depth = GBM_DEPTH
,n.minobsinnode = GBM_MINOBS
,verbose = TRUE)
> prediction <- predict.gbm(object = GBM_model
+ ,newdata = testDescr
+ ,GBM_NTREES)
> hist(prediction)
> range(prediction)
[1] -3.0374228 0.9323279

我做错了什么,我是否需要预处理(缩放,居中)数据,或者我是否需要进入并手动使用以下内容来限制/限制值:
prediction <- ifelse(prediction < 0, 0, prediction)
prediction <- ifelse(prediction > 1, 1, prediction)

最佳答案

来自 ?predict.gbm :

Returns a vector of predictions. By default the predictions are on the scale of f(x). For example, for the Bernoulli loss the returned value is on the log odds scale, poisson loss on the log scale, and coxph is on the log hazard scale.

If type="response" then gbm converts back to the same scale as the outcome. Currently the only effect this will have is returning probabilities for bernoulli and expected counts for poisson. For the other distributions "response" and "link" return the same.



所以如果你使用 distribution="bernoulli" ,您需要转换预测值以将它们重新缩放为 [0, 1]: p <- plogis(predict.gbm(model)) .使用 distribution="gaussian"实际上是用于回归而不是分类,尽管我很惊讶预测不在 [0, 1] 中:我的理解是 gbm 仍然基于树,因此预测值不应该超出训练数据中存在的值。

关于R gbm 逻辑回归,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/8410846/

29 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com