
r - nodesize parameter ignored in the randomForest package

Reposted. Author: 行者123. Updated: 2023-12-01 06:26:08

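Does the randomForest package ignore the nodesize parameter? When I predict the terminal nodes for a dataset and check the counts, I see values smaller than nodesize. I would submit a fix myself, but the underlying code is written in Fortran. If anyone can confirm this behavior, I will contact the package maintainer and hopefully get a fix started.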

> library(randomForest)
> set.seed(1)
> rf <- randomForest(mtcars[,-1], mtcars[,1], nodesize = 5)
> nodes <- attr(predict(rf, mtcars[,-1], nodes = TRUE), 'nodes')

# node counts of first tree
> table(nodes[,1])

# first row is the terminal node ID#, second row is the count
 2  6  9 10 11 14 15 16 18 19
 5  3  3  6  4  2  3  1  3  2

Added system information:
Session info ----------------------------------------------------------------
 setting  value
 version  R version 3.1.1 (2014-07-10)
 system   x86_64, mingw32
 ui       RStudio (0.98.1049)
 language (EN)
 collate  English_United States.1252
 tz       America/Chicago

Packages --------------------------------------------------------------------
 package      * version date       source
 randomForest * 4.6.10  2014-07-17 CRAN (R 3.1.1)

Best answer

Response from the package maintainer:

That parameter behaves as the way that Leo Breiman intended. The bug is in how the parameter was described. It’s the same as minsplit in the rpart:::rpart.control() function:

the minimum number of observations that must exist in a node in order for a split to be attempted.



I'll change the description in the help file in the next version to address this confusion.

Best, Andy
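
The minsplit behavior quoted above can be seen directly in rpart. The following is only a minimal sketch, not the maintainer's code: minsplit = 5 is chosen to mirror nodesize = 5 from the question, and minbucket = 1 and cp = 0 are assumptions made here so that the other stopping rules do not interfere. It compares the sizes of nodes that were split with the sizes of terminal nodes.

```r
library(rpart)

# Assumed settings for illustration: minsplit = 5 mirrors nodesize = 5 in the
# question; minbucket = 1 and cp = 0 switch off the other stopping rules.
fit <- rpart(mpg ~ ., data = mtcars,
             control = rpart.control(minsplit = 5, minbucket = 1, cp = 0))

# fit$frame has one row per node; var is "<leaf>" for terminal nodes and
# n is the number of observations that reached the node.
is_leaf     <- fit$frame$var == "<leaf>"
split_sizes <- fit$frame$n[!is_leaf]
leaf_sizes  <- fit$frame$n[is_leaf]

min(split_sizes)  # a node is only split if it holds at least minsplit rows
sort(leaf_sizes)  # terminal nodes are allowed to be smaller than minsplit
```

If nodesize in randomForest works the way the maintainer describes, the same reasoning applies there: the parameter limits which nodes may be split, not how small the resulting terminal nodes can be.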

Regarding "r - nodesize parameter ignored in the randomForest package", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28417826/
