r - Classification with gbm() - error

Reposted. Author: 行者123. Updated: 2023-12-05 04:17:17

cancer <- read.csv('breast-cancer-wisconsin.data', header = FALSE, na.strings="?")
cancer <- cancer[complete.cases(cancer),]
names(cancer)[11] <- "class"
cancer[, 11] <- factor(cancer[, 11], labels = c("benign", "malignant"))
library(gbm)

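First, I remove the NA values with complete.cases and convert the eleventh column, "class", to a factor. I want to use "class" as the response variable and every other column except the first as a predictor.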

On my first attempt, I entered:

boost.cancer <- gbm(class ~ .-V1, data = cancer, distribution = "bernoulli") 

Error in gbm.fit(x, y, offset = offset, distribution = distribution, w = w, :
Bernoulli requires the response to be in {0,1}

Then I replaced class with contrasts(class):

boost.cancer <- gbm(contrasts(class) ~ .-V1, distribution = "bernoulli", data = cancer)

Error in model.frame.default(formula = contrasts(class) ~ . - V1, data = cancer, :
variable lengths differ (found for 'V1')

How can I fix these errors? I'm sure something is wrong with my approach.

Best answer

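As the error says, your response is not in {0,1}. Instead of creating a factor (that is, skip the factor() line), you can recode the original numeric column, where 2 means benign and 4 means malignant, so that it becomes 0 and 1: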

> cancer$class <- (cancer$class -2)/2

> boost.cancer <- gbm(class ~ .-V1, data = cancer, distribution = "bernoulli")
> boost.cancer
gbm(formula = class ~ . - V1, distribution = "bernoulli", data = cancer)
A gradient boosted model with bernoulli loss function.
100 iterations were performed.
There were 9 predictors of which 4 had non-zero influence.
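
If the factor has already been created, the recoding above no longer applies, because arithmetic on a factor is not meaningful. The sketch below uses a small made-up vector (not the real data set) to illustrate two points: why contrasts(class) produced the "variable lengths differ" error, and one possible way to turn the factor into a 0/1 response. The toy values and the name y are assumptions for illustration only.

```r
# Toy stand-in for the factor built in the question (hypothetical values,
# not the real breast-cancer data).
class <- factor(c(2, 4, 2, 2, 4), labels = c("benign", "malignant"))

# contrasts() describes how the levels are coded: one row per level,
# not one value per observation. Its size therefore cannot match the
# number of rows in the data, which is what model.frame complained about.
dim(contrasts(class))   # 2 rows (levels), 1 column
length(class)           # 5 observations

# Recode the factor as 0/1, with "malignant" as the positive class.
y <- as.integer(class == "malignant")
y

# Applied to the question's data frame, this would be (assumption:
# cancer$class is the factor created with factor(..., labels = ...)):
# cancer$class <- as.integer(cancer$class == "malignant")
```

With the response recoded this way, the original call gbm(class ~ .-V1, data = cancer, distribution = "bernoulli") should accept it, since every value is then 0 or 1.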

Regarding "r - Classification with gbm() - error", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23991903/
