
r - Change from baseline for repeated IDs

Reposted. Author: 行者123. Last updated: 2023-12-04 10:45:12

For example,

> set.seed(1)
df1 <- data.frame(ID = c(rep(c(rep(1,3), rep(2,3)),2), rep(c(rep(3,3), rep(4,3)),2)),
                  Day = rep(c(1,2,3),8))
df2 <- data.frame(measure = c(rep("mean",6), rep("median",6), rep("mean",6), rep("median",6)),
                  val = sample(1:24,24))

data <- cbind(df1,df2)

> data

   ID Day measure val
1   1   1    mean   7
2   1   2    mean   9
3   1   3    mean  13
4   2   1    mean  20
5   2   2    mean   5
6   2   3    mean  18
7   1   1  median  19
8   1   2  median  12
9   1   3  median  11
10  2   1  median   1
11  2   2  median   3
12  2   3  median  14
13  3   1    mean  23
14  3   2    mean  21
15  3   3    mean   8
16  4   1    mean  16
17  4   2    mean   6
18  4   3    mean  24
19  3   1  median  22
20  3   2  median   4
21  3   3  median  17
22  4   1  median  15
23  4   2  median   2
24  4   3  median  10
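
Note that `sample()` changed its default algorithm in R 3.6.0, so `set.seed(1)` may not reproduce the `val` column printed above on a newer installation. As a minimal sketch, the same data can be built by typing in the values from the printout, so nothing is drawn at random:

```r
# Rebuild the question's data with the 'val' values copied from the printout,
# so the result does not depend on the R version's random sampler.
data <- data.frame(
  ID      = rep(c(1, 2, 1, 2, 3, 4, 3, 4), each = 3),
  Day     = rep(c(1, 2, 3), times = 8),
  measure = rep(rep(c("mean", "median"), each = 6), times = 2),
  val     = c(7, 9, 13, 20, 5, 18, 19, 12, 11, 1, 3, 14,
              23, 21, 8, 16, 6, 24, 22, 4, 17, 15, 2, 10)
)
```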

I want to create another variable that measures, within each ID and for each measure, the change from Day 1:

   ID Day measure val change
1   1   1    mean   7      0
2   1   2    mean   9      2
3   1   3    mean  13      6
4   2   1    mean  20      0
5   2   2    mean   5    -15
6   2   3    mean  18     -2
7   1   1  median  19      0
8   1   2  median  12     -7
9   1   3  median  11     -8
10  2   1  median   1      0
11  2   2  median   3      2
12  2   3  median  14     13
13  3   1    mean  23      0
14  3   2    mean  21     -2
15  3   3    mean   8    -15
16  4   1    mean  16      0
17  4   2    mean   6    -10
18  4   3    mean  24      8
19  3   1  median  22      0
20  3   2  median   4    -18
21  3   3  median  17     -5
22  4   1  median  15      0
23  4   2  median   2    -13
24  4   3  median  10     -5

I have been trying to adapt the code from "Calculating change from baseline with data in long format", but my dataset has repeated measurements.

Best answer

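We can use data.table to create the 'change' column. Convert the 'data.frame' to a 'data.table' (setDT(data)), group by 'ID' and 'measure', and within each group subtract the 'val' that corresponds to Day 1 from every 'val' to create 'change'.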

library(data.table)
setDT(data)[, change:= val-val[Day==1L], by = .(ID, measure)]
data
# ID Day measure val change
# 1: 1 1 mean 7 0
# 2: 1 2 mean 9 2
# 3: 1 3 mean 13 6
# 4: 2 1 mean 20 0
# 5: 2 2 mean 5 -15
# 6: 2 3 mean 18 -2
# 7: 1 1 median 19 0
# 8: 1 2 median 12 -7
# 9: 1 3 median 11 -8
#10: 2 1 median 1 0
#11: 2 2 median 3 2
#12: 2 3 median 14 13
#13: 3 1 mean 23 0
#14: 3 2 mean 21 -2
#15: 3 3 mean 8 -15
#16: 4 1 mean 16 0
#17: 4 2 mean 6 -10
#18: 4 3 mean 24 8
#19: 3 1 median 22 0
#20: 3 2 median 4 -18
#21: 3 3 median 17 -5
#22: 4 1 median 15 0
#23: 4 2 median 2 -13
#24: 4 3 median 10 -5
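
The grouped subtraction can also be read as a join: look up each (ID, measure) group's Day 1 value, attach it to every row of that group, then subtract. The following base R sketch spells that out on a nine-row subset of the question's data. The names `d`, `day1`, `baseline`, and `out` are illustrative and not part of the original answer.

```r
# Subset of the question's data: ID 1 and 2 for "mean", ID 1 for "median".
d <- data.frame(
  ID      = c(1, 1, 1, 2, 2, 2, 1, 1, 1),
  Day     = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  measure = c(rep("mean", 6), rep("median", 3)),
  val     = c(7, 9, 13, 20, 5, 18, 19, 12, 11)
)

# One row per (ID, measure) holding the Day 1 value.
day1 <- d[d$Day == 1, c("ID", "measure", "val")]
names(day1)[names(day1) == "val"] <- "baseline"

# Left join: a group without a Day 1 row would get baseline = NA.
out <- merge(d, day1, by = c("ID", "measure"), all.x = TRUE)

# merge() does not promise the original row order, so sort explicitly.
out <- out[order(out$measure, out$ID, out$Day), ]
out$change <- out$val - out$baseline
```

The join form is more verbose than the one-line data.table call, but it makes the baseline visible as its own column, which can help when checking results.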

A similar option using dplyr is

library(dplyr)
data %>%
    group_by(ID, measure) %>%
    mutate(change = val - val[Day==1L])
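
The dplyr version assumes every (ID, measure) group has exactly one Day 1 row. A hedged variant, assuming dplyr is installed, takes the first Day 1 match with `[1]` so that a group with no Day 1 row gets `NA` instead of a length mismatch, and drops the grouping afterwards with `ungroup()`. The data frame `d` below is a made-up example, not the question's data.

```r
library(dplyr)

# Hypothetical data: ID 1 is unsorted by Day, ID 2 has no Day 1 row.
d <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  Day     = c(2, 1, 3, 2, 3),
  measure = "mean",
  val     = c(9, 7, 13, 5, 18)
)

res <- d %>%
  group_by(ID, measure) %>%
  # val[Day == 1][1] is the Day 1 value, or NA when the group has none.
  mutate(change = val - val[Day == 1][1]) %>%
  ungroup()
```

Because the baseline is located by `Day == 1` and not by position, the rows do not need to be sorted for this to work.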

Or, if the 'Day' column is sorted, a base R option with ave

 data$change <- with(data, val-ave(val, ID, measure, FUN=function(x) head(x,1)))
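
Since `head(x, 1)` simply takes the first row of each group, the `ave` approach gives the Day 1 baseline only when Day 1 comes first within each group. A small sketch, using a made-up three-row data frame `d`, sorts by ID, measure, and Day before applying the same idea:

```r
# Hypothetical rows for one ID, deliberately out of Day order.
d <- data.frame(
  ID      = c(1, 1, 1),
  Day     = c(3, 1, 2),
  measure = "mean",
  val     = c(13, 7, 9)
)

# Sort so that Day 1 is the first row of every (ID, measure) group.
d <- d[order(d$ID, d$measure, d$Day), ]

# ave() returns the group's first value repeated for each row of the group.
d$change <- with(d, val - ave(val, ID, measure, FUN = function(x) x[1]))
```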

Or another base R option without grouping, if the rows are ordered

 data$change <- with(data, {i <- Day==1L; val-(val*i)[val*i>0][cumsum(i)] }) 
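
The ungrouped expression filters with `val*i > 0`, so it relies on every Day 1 value being strictly positive (true for the sample data, which holds the integers 1 to 24). A sketch of a variant that drops this assumption indexes the Day 1 values directly. It still assumes each group's rows are contiguous and start with Day 1. The vectors below are made-up values chosen to include a zero baseline.

```r
# Hypothetical data: two groups of three days, first baseline is 0.
Day <- c(1, 2, 3, 1, 2, 3)
val <- c(0, 5, 3, 4, 6, 1)

i <- Day == 1

# val[i] holds one baseline per group; cumsum(i) is the running group number,
# so val[i][cumsum(i)] repeats each baseline across its group's rows.
change <- val - val[i][cumsum(i)]
```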

This question, r - Change from baseline for repeated IDs, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/31619437/
