r - enet() works, but fails when run through caret::train()

Reposted. Author: 行者123. Updated: 2023-11-30 08:25:57

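I am trying to run an elastic net, starting with the LASSO and going from there. I can get it to run directly, but when I run it with the same parameters through train from the caret package, it fails. I would like to get train working so that I can use it to evaluate model parameters.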

# Works
test <- enet( x=x, y=y, lambda=0, trace=TRUE, normalize=FALSE, intercept=FALSE )
# Doesn't
enetGrid <- data.frame(.lambda=0,.fraction=c(.01,.001,.0005,.0001))
ctrl <- trainControl( method="repeatedcv", repeats=5 )
> test2 <- train( x, y, method="enet", tuneGrid=enetGrid, trControl=ctrl, preProc=NULL )
fraction lambda RMSE Rsquared RMSESD RsquaredSD
1 1e-04 0 NaN NaN NA NA
2 5e-04 0 NaN NaN NA NA
3 1e-03 0 NaN NaN NA NA
4 1e-02 0 NaN NaN NA NA
Error in train.default(x, y, method = "enet", tuneGrid = enetGrid, trControl = ctrl, :
final tuning parameters could not be determined
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
...
50: In eval(expr, envir, enclos) :
model fit failed for Fold10.Rep5: lambda=0, fraction=0.01 Error in enet(as.matrix(trainX), trainY, lambda = lmbda) :
Some of the columns of x have zero variance

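Note that any collinearity in the example above is only a result of subsetting the data for a reproducible example (1,000 rows versus 208,000 rows in the real dataset).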

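I have checked the full dataset in a variety of ways, including findLinearCombos. Note that several hundred of the variables are dummies derived from clinical diagnoses and are therefore binary, with a low proportion of 1s.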

How can I run train(..., method="enet") with exactly the same settings as enet()?

Reproducibility, session info, and data

Sample data for x and y are available here.

Output of sessionInfo():

R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C LC_PAPER=C
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C

attached base packages:
[1] parallel splines grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] scales_0.2.3 elasticnet_1.1 fscaret_0.8.5.3 gsubfn_0.6-5 proto_0.3-10 lars_1.2 caret_5.17-7
[8] foreach_1.4.1 cluster_1.14.4 lubridate_1.3.0 HH_2.3-37 reshape_0.8.4 latticeExtra_0.6-24 leaps_2.9
[15] multcomp_1.2-18 perturb_2.05 Zelig_4.2-0 sandwich_2.2-10 zoo_1.7-10 survey_3.29-5 Hmisc_3.12-2
[22] survival_2.37-4 lme4_0.999999-2 bayesm_2.2-5 stargazer_4.0 pscl_1.04.4 vcd_1.2-13 colorspace_1.2-2
[29] mvtnorm_0.9-9995 car_2.0-18 nnet_7.3-7 gdata_2.13.2 gtools_3.0.0 spBayes_0.3-7 Formula_1.1-1
[36] magic_1.5-4 abind_1.4-0 MapGAM_0.6-2 gam_1.08 fields_6.7.6 maps_2.3-2 spam_0.29-3
[43] FNN_1.0 spatstat_1.31-3 mgcv_1.7-24 rgeos_0.2-19 RArcInfo_0.4-12 automap_1.0-12 gstat_1.0-16
[50] SDMTools_1.1-13 rgdal_0.8-10 spdep_0.5-60 coda_0.16-1 deldir_0.0-22 maptools_0.8-25 nlme_3.1-110
[57] MASS_7.3-27 Matrix_1.0-12 lattice_0.20-15 boot_1.3-9 data.table_1.8.8 xtable_1.7-1 RCurl_1.95-4.1
[64] bitops_1.0-5 RColorBrewer_1.0-5 testthat_0.7.1 codetools_0.2-8 devtools_1.3 stringr_0.6.2 foreign_0.8-54
[71] ggplot2_0.9.3.1 sp_1.0-11 taRifx_1.0.5 reshape2_1.2.2 plyr_1.8 functional_0.4 R.utils_1.25.2
[78] R.oo_1.13.9 R.methodsS3_1.4.4

loaded via a namespace (and not attached):
[1] LearnBayes_2.12 compiler_3.0.1 dichromat_2.0-0 digest_0.6.3 evaluate_0.4.4 gtable_0.1.2 httr_0.2 intervals_0.14.0 iterators_1.0.6
[10] labeling_0.2 memoise_0.1 munsell_0.4.2 rpart_4.1-1 spacetime_1.0-5 stats4_3.0.1 tcltk_3.0.1 tools_3.0.1 whisker_0.3-2
[19] xts_0.9-5

Update

Running on a 15% sample of the dataset:

Warning in eval(expr, envir, enclos) :
model fit failed for Fold10.Rep1: lambda=0, fraction=0.005
... (more of the same warning messages) ...
Warning in nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, :
There were missing values in resampled performance measures.
Error in if (lambda > 0) { : argument is of length zero
Calls: train ... train.default -> system.time -> createModel -> enet

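The X matrix has 806 columns, 801 of which are dummies. Many of these dummies are extremely sparse (1-3 observations out of roughly 25k rows); the others are TRUE for 0.1-5% of rows. In total there are 108,867 TRUE values and about 21 million FALSE values.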
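The "Some of the columns of x have zero variance" warning is consistent with this level of sparsity: a dummy with only a handful of 1s can vary over the full data yet be constant within a resampled training set. The following is a minimal base R sketch on synthetic data (the sizes, seed, and variable names are assumptions for illustration, not the asker's data) showing how a column with a single 1 becomes all zeros whenever the fold containing that row is held out.

```r
# Synthetic illustration only: sizes and names are assumptions.
set.seed(1)
n <- 1000
n_cols <- 5

# Five dummy columns, each with exactly one 1 among n rows
x_demo <- matrix(0, nrow = n, ncol = n_cols)
for (j in seq_len(n_cols)) {
  x_demo[sample(n, 1), j] <- 1
}

# On the full data, every column has nonzero variance
full_zero_var <- sum(apply(x_demo, 2, var) == 0)

# Assign each row to one of 10 folds
fold_id <- sample(rep(1:10, length.out = n))

# For each held-out fold, count the columns that are constant
# in the remaining training rows
zero_var_per_fold <- sapply(1:10, function(k) {
  train_rows <- fold_id != k
  sum(apply(x_demo[train_rows, , drop = FALSE], 2, var) == 0)
})

full_zero_var      # no constant columns in the full data
zero_var_per_fold  # constant columns appear in some training sets
```

Each single-1 column is constant in exactly one of the ten training sets, so a direct fit on the full matrix can succeed while some resampled fits fail.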

Best answer

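Just to close this out: I have it working now. I removed all columns with fewer than 20 TRUE values (bear in mind this is out of nearly 200k observations), since they do not carry enough information to contribute. That was about half of them.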
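The filtering step described above could look roughly like the sketch below. The helper name drop_sparse_dummies and the toy matrix are hypothetical, introduced only for illustration; the threshold of 20 comes from the answer. It assumes the matrix passed in contains only 0/1 dummy columns, so any continuous predictors would need to be set aside first.

```r
# Hypothetical helper, not part of caret or elasticnet.
# Keeps only the columns with at least `min_true` nonzero entries.
drop_sparse_dummies <- function(x, min_true = 20) {
  n_true <- colSums(x != 0)
  x[, n_true >= min_true, drop = FALSE]
}

# Toy 0/1 matrix: two reasonably populated dummies and one very sparse one
x_toy <- cbind(
  common = rep(c(1, 0, 0, 0, 0), times = 100),  # 100 ones in 500 rows
  rare   = c(rep(1, 3), rep(0, 497)),           # 3 ones in 500 rows
  dense  = rep(c(1, 0), times = 250)            # 250 ones in 500 rows
)

x_kept <- drop_sparse_dummies(x_toy, min_true = 20)
colnames(x_kept)  # "rare" is dropped
```

caret also provides nearZeroVar(), which screens on frequency ratio and percentage of unique values rather than a raw count; whether its defaults suit data this sparse would need checking.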

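Going forward I will have to be careful that these sparse columns do not introduce too much bias and so on, but I am hoping that using a method that performs variable selection (lasso, RF, etc.) will reduce the problem.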

Thanks to @O_Devinyak for the help.

Regarding "r - enet() works, but fails when run through caret::train()", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19122617/
