R: How do I output factor levels, rather than indices, from a for loop?

Reposted. Author: 行者123. Updated: 2023-12-03 20:27:49

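I have a data frame on which I am running a Monte Carlo simulation, using for loops to generate the simulated distribution. While I test the simulation code, I only access the first observation in the data frame: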

Male.MC <- c()
for (j in 1:100) {
  for (i in 1:1) {
    # u2 <- Male.DistF$Male.stddev_u2[i] * rnorm(1, mean = 0, sd = 1)
    u2 <- Male.DistF$RndmEffct[i] * rnorm(1, mean = 0, sd = 1)
    mc_bca <- Male.DistF$lmefits[i] + u2
    temp <- Lambda.Value * mc_bca + 1
    ginv_a <- temp^(1 / Lambda.Value)
    d2ginv_a <- max(0, (1 - Lambda.Value) * temp^(1 / Lambda.Value - 2))
    mc_amount <- ginv_a + d2ginv_a * Male.DistF$Male.var[i]^2 / 2
    z <- c(RespondentID <- Male.DistF$RespondentID[i],
           Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i],
           Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)
    Male.MC <- as.data.frame(rbind(Male.MC, z))
  }
}
colnames(Male.MC) <- c("RespondentID", "AgeFactor",
                       "SampleWeight", "VarByAge",
                       "lmefits", "u2", "mc_amount")

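The code works fine, except that Male.DistF$RespondentID is a factor, and instead of the factor levels I get the factor indices in the output. In this case I get 1, because RespondentID is in ascending order in the Male.DistF data frame. AgeFactor has the same problem: I get the index rather than the factor level.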

head(Male.MC)
RespondentID AgeFactor SampleWeight VarByAge lmefits u2 mc_amount
z 1 3 0.4952835 0.4189871 15.22634 0.2334501 11582.681
2 1 3 0.4952835 0.4189871 15.22634 0.3205741 11984.220
3 1 3 0.4952835 0.4189871 15.22634 -0.5674165 8420.678
4 1 3 0.4952835 0.4189871 15.22634 -0.5426489 8505.421
5 1 3 0.4952835 0.4189871 15.22634 0.4878695 12790.565
6 1 3 0.4952835 0.4189871 15.22634 0.1556925 11234.583

How do I get the Male.MC data frame to contain the factor levels for these two variables? I have tried both of the following:

z <- c(RespondentID <- as.character(Male.DistF$RespondentID[i]), 
Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i],
Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)

z <- c((as.character(Male.DistF$RespondentID[i])), 
Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i],
Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)

Both fix the RespondentID output, but I have got the syntax wrong somewhere, because R now tries to convert all of the output to factors:

There were 50 or more warnings (use warnings() to see the first 50)
str(Male.MC)
'data.frame': 100 obs. of 7 variables:
$ RespondentID: Factor w/ 1 level "100020": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr "z" "" "" "" ...
$ AgeFactor : Factor w/ 1 level "3": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr "z" "" "" "" ...
$ SampleWeight: Factor w/ 1 level "0.495283471": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr "z" "" "" "" ...
$ VarByAge : Factor w/ 1 level "0.418987052181831": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr "z" "" "" "" ...
$ lmefits : Factor w/ 1 level "15.2263403968895": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr "z" "" "" "" ...
$ u2 : Factor w/ 1 level "-0.100954008424162": 1 NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr "z" "" "" "" ...
$ mc_amount : Factor w/ 1 level "10151.4582133747": 1 NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr "z" "" "" "" ...

For testing, here are the first few rows of the input data frame Male.DistF:

     AgeFactor RespondentID SampleWeight IntakeAmt   RndmEffct NutrientID Gender Age BodyWeight  IntakeDay BoxCoxXY  lmefits      lmeres   TotWts   GrpWts NumSubjects TotSubjects  Male.var
1725 9to13 100020 0.4952835 12145.852 0.30288536 267 1 12 51.6 Day1Intake 15.61196 15.22634 0.27138449 2291.827 763.0604 525 2249 0.4189871
203 14to18 100419 0.3632839 9591.953 0.02703093 267 1 14 46.3 Day1Intake 15.01444 15.31373 -0.18039624 2291.827 472.3106 561 2249 0.3365423

Lambda.Value is 0.1. The structure of Male.DistF is:

str(Male.DistF)
'data.frame': 2249 obs. of 18 variables:
$ AgeFactor : Ord.factor w/ 4 levels "1to3"<"4to8"<..: 3 4 3 4 2 2 3 1 1 3 ...
$ RespondentID: Factor w/ 2249 levels "100020","100419",..: 1 2 3 4 5 6 7 8 9 10 ...
$ SampleWeight: num 0.495 0.363 0.495 1.326 2.12 ...
$ IntakeAmt : num 12146 9592 7839 11113 7150 ...
$ RndmEffct : num 0.3029 0.027 0.0772 0.4667 -0.1593 ...
$ NutrientID : int 267 267 267 267 267 267 267 267 267 267 ...
$ Gender : int 1 1 1 1 1 1 1 1 1 1 ...
$ Age : int 12 14 11 15 6 5 10 2 2 9 ...
$ BodyWeight : num 51.6 46.3 46.1 63.2 28.4 18 38.2 14.4 14.6 32.1 ...
$ IntakeDay : Factor w/ 2 levels "Day1Intake","Day2Intake": 1 1 1 1 1 1 1 1 1 1 ...
$ BoxCoxXY : num 15.6 15 14.5 15.4 14.3 ...
$ lmefits : num 15.2 15.3 15 15.8 14.3 ...
$ lmeres : num 0.271 -0.18 -0.342 -0.424 -0.053 ...
$ TotWts : num 2292 2292 2292 2292 2292 ...
$ GrpWts : num 763 472 763 472 779 ...
$ NumSubjects : int 525 561 525 561 613 613 525 550 550 525 ...
$ TotSubjects : int 2249 2249 2249 2249 2249 2249 2249 2249 2249 2249 ...
$ Male.var : num 0.419 0.337 0.419 0.337 0.267 ...

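As you can see from my Male.DistF data, for the 100 replicates of the first observation I want the Male.MC data frame to contain 100020 as the RespondentID (not 1) and 9to13 as the AgeFactor (not 3). Where did I go wrong in my output instruction, and how do I fix it? In particular, I don't understand why my attempts with as.character went so badly astray that they affected the whole output. I would also welcome suggestions for speeding up the loop. All I am doing is building 100 sets of values for each observation in my Male.DistF data frame.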

Best answer

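You can try replacing the line

z <- c(...

which creates the new row as a vector (and therefore forces all elements to have the same type), with a one-row data.frame, which preserves the type of each column.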
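The coercion described above can be reproduced in isolation. The following is a minimal sketch rather than the asker's code: `toy` is a made-up two-row stand-in for Male.DistF, and the names `as_codes`, `as_text`, and `as_row` are hypothetical, chosen only for illustration.

```r
# Toy stand-in for Male.DistF (hypothetical; only three columns).
toy <- data.frame(
  RespondentID = factor(c("100020", "100419")),
  AgeFactor    = factor(c("9to13", "14to18"),
                        levels = c("1to3", "4to8", "9to13", "14to18"),
                        ordered = TRUE),
  SampleWeight = c(0.495, 0.363)
)
i <- 1

# c() on a mix of factors and numbers returns a plain numeric vector,
# so only the integer codes of the factors survive.
as_codes <- c(toy$RespondentID[i], toy$AgeFactor[i], toy$SampleWeight[i])
print(as_codes)

# One character element makes c() coerce every element to character.
# AgeFactor is still passed as a factor, so its code (not its label) is kept.
as_text <- c(as.character(toy$RespondentID[i]),
             toy$AgeFactor[i], toy$SampleWeight[i])
print(as_text)

# A one-row data.frame keeps each column's own class and levels.
as_row <- data.frame(RespondentID = toy$RespondentID[i],
                     AgeFactor    = toy$AgeFactor[i],
                     SampleWeight = toy$SampleWeight[i])
str(as_row)
```

This probably also accounts for the all-factor output in the question. Once the row is a character vector, rbind() builds a character matrix, and in R versions before 4.0.0 as.data.frame() converted character columns to factors by default. On later iterations, new u2 and mc_amount values that were not already levels would then become NA with a warning.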

z <- data.frame(
  RespondentID = Male.DistF$RespondentID[i],
  AgeFactor = Male.DistF$AgeFactor[i],
  SampleWeight = Male.DistF$SampleWeight[i],
  VarByAge = Male.DistF$Male.var[i],
  lmefits = Male.DistF$lmefits[i],
  u2 = u2,
  mc_amount = mc_amount
)
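On the request for a faster loop, one possible approach (not part of the original answer) is to skip the row-by-row rbind() and build every replicate at once by indexing with repeated row numbers. The sketch below assumes a hypothetical `toy` data frame holding only the columns the loop uses, filled with the two sample rows shown in the question; `n_reps` and `idx` are illustrative names. Indexing a factor column keeps its levels, so the index problem does not arise.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical stand-in for Male.DistF, using the two rows shown in the question.
toy <- data.frame(
  RespondentID = factor(c("100020", "100419")),
  AgeFactor    = factor(c("9to13", "14to18"),
                        levels = c("1to3", "4to8", "9to13", "14to18"),
                        ordered = TRUE),
  SampleWeight = c(0.4952835, 0.3632839),
  RndmEffct    = c(0.30288536, 0.02703093),
  lmefits      = c(15.22634, 15.31373),
  Male.var     = c(0.4189871, 0.3365423)
)
Lambda.Value <- 0.1
n_reps <- 100

# Repeat each row index n_reps times: 1, 1, ..., 2, 2, ...
idx <- rep(seq_len(nrow(toy)), each = n_reps)

# Same arithmetic as the loop body, applied to whole vectors.
u2        <- toy$RndmEffct[idx] * rnorm(length(idx), mean = 0, sd = 1)
mc_bca    <- toy$lmefits[idx] + u2
temp      <- Lambda.Value * mc_bca + 1
ginv_a    <- temp^(1 / Lambda.Value)
# pmax() is the element-wise counterpart of max() for vectors.
d2ginv_a  <- pmax(0, (1 - Lambda.Value) * temp^(1 / Lambda.Value - 2))
mc_amount <- ginv_a + d2ginv_a * toy$Male.var[idx]^2 / 2

# Assemble the result once; factor columns keep their levels.
Male.MC <- data.frame(
  RespondentID = toy$RespondentID[idx],
  AgeFactor    = toy$AgeFactor[idx],
  SampleWeight = toy$SampleWeight[idx],
  VarByAge     = toy$Male.var[idx],
  lmefits      = toy$lmefits[idx],
  u2           = u2,
  mc_amount    = mc_amount
)
str(Male.MC)
```

The rows come out grouped by respondent rather than by replicate, and the random draws are taken in a different order than in the loop, so individual simulated values will not match the loop's output even with the same seed.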

This question, "R: How do I output factor levels, rather than indices, from a for loop?", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/8774515/
