r - Lasso regression, generating a coefficient matrix

Reposted. Author: 行者123. Updated: 2023-11-30 08:53:47

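I have this lasso regression code, and when I print the beta coefficients I get many sets of beta values instead of just one. I did not specify any value for lambda; when I do specify one, I get only a single set of betas. I know how to find the optimal value of lambda. My question is: why do I get so many sets of betas when I do not specify lambda? Are these betas continuous variables?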

Here is the code: 
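library(MASS)    # provides the Boston data set
library(glmnet)  # lasso / elastic-net fitting
Boston <- na.omit(Boston)
x <- model.matrix(crim ~ ., Boston)[, -1]  # predictor matrix without the intercept column
y <- as.matrix(Boston$crim)
lasso.mod <- glmnet(x, y, alpha = 1)       # alpha = 1 is the lasso; lambda is left at its default
beta <- coef(lasso.mod)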

When I print beta, I get these betas (only some are shown here, since beta is a 14x77 matrix):

(Intercept) 3.613524 3.0893231 2.6116912 2.176491 1.7799525 1.4186414 1.0894283 0.7894616 0.5161430 0.10644553
zn . . . . . . . . . .
indus . . . . . . . . . .
chas . . . . . . . . . .
nox . . . . . . . . . .
rm . . . . . . . . . .
age . . . . . . . . . .
dis . . . . . . . . . .
rad . 0.0548935 0.1049104 0.150484 0.1920089 0.2298449 0.2643196 0.2957317 0.3243532 0.34314278
tax . . . . . . . . . .
ptratio . . . . . . . . . .
black . . . . . . . . . .
lstat . . . . . . . . . 0.01819859
medv . . . . . . . . . .

(Intercept) -0.29224457 -0.65554971 -0.98654448 -1.2881346 -1.551777e+00 -1.3115723669 -1.023961164 -0.760703960
zn . . . . . . . .
indus . . . . . . . .
chas . . . . . . . .
nox . . . . . . . .
rm . . . . . . . .
age . . . . . . . .
dis . . . . . . . .
rad 0.35910506 0.37366600 0.38691580 0.3989885 4.099887e-01 0.4167185339 0.423004227 0.428776109
tax . . . . . . . .
ptratio . . . . . . . .
black . . . . -2.682197e-05 -0.0008416848 -0.001560914 -0.002216123
lstat 0.03766106 0.05538458 0.07154406 0.0862680 9.955836e-02 0.1059656295 0.109649285 0.112926619
medv . . . . . -0.0042117132 -0.010323462 -0.015921859

(Intercept) -0.520830886 -0.302267470 -0.105253730 0.076376939 0.241885979 0.392691730 0.627291211 0.864528799
zn . . . . . . . .
indus . . . . . . . .
chas . . . . . . . .
nox . . . . . . . .
rm . . . . . . . .
age . . . . . . . .
dis . . . . . . -0.013081595 -0.027872125
rad 0.434035445 0.438827556 0.443126465 0.447110129 0.450740716 0.454048777 0.456008475 0.457602313
tax . . . . . . . .
ptratio . . . . . . . .
black -0.002813124 -0.003357088 -0.003852845 -0.004304448 -0.004715927 -0.005090852 -0.005417568 -0.005712667
lstat 0.115912528 0.118633177 0.121243578 0.123491798 0.125539133 0.127404580 0.127149522 0.126384081
medv -0.021022995 -0.025670960 -0.029854724 -0.033717719 -0.037237917 -0.040445393 -0.044169297 -0.047781258

(Intercept) 1.079254571 1.274889342 1.453144352 1.612076416 1.760570e+00 2.050760033 2.322171047
zn . . . . 1.285559e-05 0.004410478 0.008432185
indus . . . . . . .
chas . . . . . . -0.037708532
nox . . . . . . .
rm . . . . . . .
age . . . . . . .
dis -0.041210398 -0.053363055 -0.064436101 -0.074320056 -8.362340e-02 -0.125411018 -0.164769794
rad 0.459071435 0.460409659 0.461628996 0.462694814 4.637089e-01 0.463773651 0.463944084
tax . . . . . . .
ptratio . . . . . . .
black -0.005981441 -0.006226339 -0.006449481 -0.006653008 -6.838219e-03 -0.006939683 -0.007027304
lstat 0.125716380 0.125108816 0.124555233 0.124200710 1.237351e-01 0.121952792 0.119996733
medv -0.051057488 -0.054042369 -0.056762080 -0.059181727 -6.144805e-02 -0.066926875 -0.071842416

(Intercept) 2.549278186 2.762617045 2.952609432 3.1338778318 3.538132302 4.130162117 5.451162673 6.64884950
zn 0.012036563 0.015337882 0.018333706 0.0210755780 0.023426675 0.025313462 0.026747451 0.02813502
indus . . . -0.0001924235 -0.013776751 -0.025787090 -0.032420857 -0.03810011
chas -0.124943128 -0.204109888 -0.276274870 -0.3416707320 -0.380135159 -0.421664111 -0.443464956 -0.46402519
nox . . . . . -0.003479190 -0.917150073 -1.74619114
rm . . . . . . . .
age . . . . . . . .
dis -0.200411673 -0.233281676 -0.262884152 -0.2906434425 -0.335943512 -0.375943571 -0.430590855 -0.48105322
rad 0.463938072 0.464006464 0.464034620 0.4641320102 0.466554241 0.469838965 0.476096780 0.48167304
tax . . . . . . . .
ptratio . . . . . -0.011535857 -0.035347459 -0.05667073
black -0.007105454 -0.007175524 -0.007239875 -0.0072986549 -0.007367437 -0.007404653 -0.007449503 -0.00749179
lstat 0.118971687 0.117768685 0.116835186 0.1158446610 0.116189895 0.115858916 0.116384542 0.11661762
medv -0.075653219 -0.079251279 -0.082455778 -0.0854666531 -0.090263034 -0.095771213 -0.102812853 -0.10924776

Best answer

The main answer is given in ?glmnet:

lambda

A user supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this. WARNING: use with care. Avoid supplying a single value for lambda (for predictions after CV use predict() instead). Supply instead a decreasing sequence of lambda values. glmnet relies on its warms starts for speed, and its often faster to fit a whole path than compute a single fit.
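The advice quoted above can be sketched as follows: fit the whole path once, then pull out a single set of coefficients with the `s` argument of `coef()` instead of refitting with one lambda. The value `s = 0.5` is an arbitrary choice for illustration and does not come from the original post; `cv.glmnet()` is shown only as one common way to pick lambda, and the seed is there only to make the folds reproducible.

```r
library(MASS)
library(glmnet)

x <- model.matrix(crim ~ ., Boston)[, -1]
y <- Boston$crim

# Fit the full regularization path (lambda not supplied).
lasso.mod <- glmnet(x, y, alpha = 1)

# One column of coefficients per lambda value on the path.
dim(coef(lasso.mod))

# A single set of coefficients at one lambda (0.5 is an arbitrary example value).
b <- coef(lasso.mod, s = 0.5)
dim(b)

# Coefficients at a cross-validated lambda.
set.seed(1)
cv.fit <- cv.glmnet(x, y, alpha = 1)
b.cv <- coef(cv.fit, s = "lambda.min")
b.cv
```

Each column of `coef(lasso.mod)` is therefore a separate lasso fit at a different penalty strength. The sets of betas are solutions along a grid of lambda values, not a continuous variable.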

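Now, by default nlambda is 100, not 77. The smallest lambda is a fraction of the largest one, with the fraction given by

lambda.min.ratio = ifelse(nobs<nvars,0.01,0.0001)

while the largest lambda is the smallest value at which all coefficients are zero. Finally, in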

lasso.mod
# Call: glmnet(x = x, y = y, alpha = 1)
#
# Df %Dev Lambda
# [1,] 0 0.00000 5.375000
# [2,] 1 0.06643 4.897000
# [3,] 1 0.12160 4.462000
# [4,] 1 0.16740 4.066000
# .....
# [73,] 13 0.45400 0.006627
# [74,] 13 0.45400 0.006038
# [75,] 13 0.45400 0.005501
# [76,] 13 0.45400 0.005013
# [77,] 13 0.45400 0.004567

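we see that the percentage of deviance explained no longer seems to change. For that reason the lambda sequence is terminated early, before reaching 100 values.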
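As a rough check of this explanation, the fitted object can be inspected directly. The sketch below assumes the same Boston fit as in the question and uses the `lambda` and `dev.ratio` components of the glmnet object; the name `fit` is just a local variable chosen here. The exact stopping thresholds are not stated in this answer; they are documented under `?glmnet.control`.

```r
library(MASS)
library(glmnet)

x <- model.matrix(crim ~ ., Boston)[, -1]
y <- Boston$crim
fit <- glmnet(x, y, alpha = 1)

# Number of lambda values actually fitted (the output above shows 77, not 100).
n.lambda <- length(fit$lambda)
n.lambda

# Consecutive lambdas shrink by a constant factor, so the grid is log-spaced.
step.ratio <- fit$lambda[-1] / fit$lambda[-n.lambda]
range(step.ratio)

# Factor implied by the defaults nlambda = 100 and lambda.min.ratio = 0.0001.
expected.ratio <- 0.0001^(1 / 99)
expected.ratio

# Gain in deviance explained over the last few steps of the path.
tail(diff(fit$dev.ratio))
```

If the last differences in `dev.ratio` are close to zero, further decreasing lambda would add almost nothing to the fit, which is consistent with the path stopping before all 100 values are used.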

Regarding "r - Lasso regression, generating a coefficient matrix", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/49018804/
