
r - Difficulty manipulating longitudinal data row-wise in R

Reposted. Author: 行者123. Updated: 2023-12-04 10:12:57

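I'm having some trouble working with longitudinal data: my dataset has one unique ID per row, followed by a series of visit dates. Each visit has values for 3 dichotomous variables.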

data1 <- structure(list(V1date = structure(c(2L, 1L, 2L, 3L, 4L), .Label = c("1/22/12", "4/5/12", "8/18/12", "9/6/12"), class = "factor"), 
V1a = structure(c(1L, 1L, 2L, 1L, 2L), .Label = c("No", "Yes"), class = "factor"),
V1b = structure(c(2L, 1L, 1L, 1L, 1L), .Label = c("No", "Yes"), class = "factor"),
V1c = structure(c(1L, 2L, 1L, 1L, 1L), .Label = c("No", "Yes"), class = "factor"),
V2date = structure(c(1L, 2L, 4L, 3L, NA), .Label = c("6/18/12", "7/5/12", "9/22/12", "9/4/12"), class = "factor"),
V2a = structure(c(1L, 1L, 1L, 1L, NA), .Label = "Yes", class = "factor"),
V2b = structure(c(1L, 1L, 1L, 1L, NA), .Label = "No", class = "factor"),
V2c = structure(c(1L, 1L, 1L, 1L, NA), .Label = "Yes", class = "factor"),
V3date = structure(c(NA, NA, 1L, NA, 2L), .Label = c("11/1/12", "12/4/12"), class = "factor"),
V3a = structure(c(NA, NA, 1L, NA, 1L), .Label = "Yes", class = "factor"),
V3b = structure(c(NA, NA, 1L, NA, 1L), .Label = "No", class = "factor"),
V3c = structure(c(NA, NA, 2L, NA, 1L), .Label = c("No", "Yes"), class = "factor")),
.Names = c("V1date", "V1a", "V1b", "V1c", "V2date", "V2a", "V2b", "V2c", "V3date", "V3a", "V3b", "V3c"),
class = "data.frame", row.names = c("001", "002", "003", "004", "005"))

data1
     V1date V1a V1b V1c  V2date  V2a  V2b  V2c  V3date  V3a  V3b  V3c
001  4/5/12  No Yes  No 6/18/12  Yes   No  Yes    <NA> <NA> <NA> <NA>
002 1/22/12  No  No Yes  7/5/12  Yes   No  Yes    <NA> <NA> <NA> <NA>
003  4/5/12 Yes  No  No  9/4/12  Yes   No  Yes 11/1/12  Yes   No  Yes
004 8/18/12  No  No  No 9/22/12  Yes   No  Yes    <NA> <NA> <NA> <NA>
005  9/6/12 Yes  No  No    <NA> <NA> <NA> <NA> 12/4/12  Yes   No   No

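Of the 8 possible combinations of the three variables, 4 are "abnormal" and the other 4 are "normal". Everyone starts out abnormal, and then either stays abnormal at subsequent visits or resolves to a normal pattern at a later visit. (I'm ignoring reversion to abnormal: once they're normal, they're normal.)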

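Ultimately I need to add 4 new columns to the right of the data frame indicating: 1) the date of the last completed visit (regardless of any intervening NAs), 2) whether the ID eventually resolved or stayed abnormal, 3) if resolved, what the resolving pattern was, and 4) what the resolution date was. NAs always come in groups of 4 (i.e. no visit date and no values for the 3 variables) and are to be ignored.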

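For example, if the patterns "yes-yes-no", "yes-no-yes", "no-yes-yes", and "yes-yes-yes" are normal and the remaining patterns are abnormal, the result would be four additional columns like the following: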

data2 <- structure(list(
LastVisDate = structure(c(3L, 2L, 3L, 3L, 2L), .Label = c("6/18/12", "12/4/12", "11/1/12", "9/22/12"), class = "factor"),
Resolved = structure(c(2L, 2L, 2L, 2L, 1L), .Label = c("No", "Yes"), class = "factor"),
Pattern = structure(c(1L, 1L, 1L, 1L, NA), .Label = "yny", class = "factor"),
Resdate = structure(c(1L, 2L, 3L, 4L, NA), .Label = c("6/18/12", "7/5/12", "9/4/12", "9/22/12"), class = "factor")),
.Names = c("LastVisDate", "Resolved", "Pattern", "Resdate"),
class = "data.frame", row.names = c("001", "002", "003", "004", "005"))

data2
    LastVisDate Resolved Pattern Resdate
001     11/1/12      Yes     yny 6/18/12
002     12/4/12      Yes     yny  7/5/12
003     11/1/12      Yes     yny  9/4/12
004     11/1/12      Yes     yny 9/22/12
005     12/4/12       No    <NA>    <NA>

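I've spent a lot of time on this project, but I can't figure out how to get R to march rightward across the dataset until my stopping rule is met. Any suggestions would be much appreciated.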

Best answer

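This relies on the structure of your data. In particular, it assumes there are three values starting at columns 2, 6, and 10, which are passed to the function that determines whether someone is "normal".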

Here is a function that determines whether someone is "normal". There are other ways to write it.

# TRUE if the three values in x match any of the four "normal" patterns
is.normal <- function(x) {
  any(c(
    all(x == c("Yes", "Yes", "No")),
    all(x == c("Yes", "No", "Yes")),
    all(x == c("No", "Yes", "Yes")),
    all(x == c("Yes", "Yes", "Yes"))
  ))
}
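As one of those other ways, here is a minimal sketch that looks the pattern up in a vector of allowed patterns instead of spelling out four `all()` comparisons. The names `normal.patterns` and `is.normal.lookup` are hypothetical and not part of the original answer. It assumes, as the question states, that missing values only occur for entirely missing visits, and it returns `NA` in that case.

```r
# Patterns treated as "normal", taken from the example in the question.
normal.patterns <- c("Yes-Yes-No", "Yes-No-Yes", "No-Yes-Yes", "Yes-Yes-Yes")

# Hypothetical alternative to is.normal(): collapse the three values into
# one string and check whether it is one of the normal patterns.
is.normal.lookup <- function(x) {
  x <- as.character(x)
  if (anyNA(x)) return(NA)  # a missing visit stays NA
  paste(x, collapse = "-") %in% normal.patterns
}

is.normal.lookup(c("Yes", "No", "Yes"))  # TRUE
is.normal.lookup(c("No", "No", "Yes"))   # FALSE
is.normal.lookup(c(NA, NA, NA))          # NA
```

Changing which patterns count as normal then only requires editing `normal.patterns`.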

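We use this, applied to the appropriate sets of columns. This relies on the exact layout you specified in your question. Note the column numbers passed to vapply. The result here is a logical matrix telling whether someone is "normal" at each visit.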

# One column of ok per visit; each visit's three values start at columns 2, 6, 10
ok <- vapply(c(2, 6, 10),
             function(x) apply(data1[x:(x + 2)], 1, is.normal),
             logical(nrow(data1)))

> ok
     [,1] [,2]  [,3]
001 FALSE TRUE    NA
002 FALSE TRUE    NA
003 FALSE TRUE  TRUE
004 FALSE TRUE    NA
005 FALSE   NA FALSE
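Because the hard-coded column numbers depend on the exact layout, a possible variation is to derive them from the column names. The sketch below is not part of the original answer: it assumes every visit's date column ends in `date` and is immediately followed by its three Yes/No columns, and it rebuilds `data1` with character columns (rather than factors) so that it runs on its own. `starts` is a hypothetical name.

```r
# data1 from the question, rebuilt with character columns for brevity.
data1 <- data.frame(
  V1date = c("4/5/12", "1/22/12", "4/5/12", "8/18/12", "9/6/12"),
  V1a = c("No", "No", "Yes", "No", "Yes"),
  V1b = c("Yes", "No", "No", "No", "No"),
  V1c = c("No", "Yes", "No", "No", "No"),
  V2date = c("6/18/12", "7/5/12", "9/4/12", "9/22/12", NA),
  V2a = c("Yes", "Yes", "Yes", "Yes", NA),
  V2b = c("No", "No", "No", "No", NA),
  V2c = c("Yes", "Yes", "Yes", "Yes", NA),
  V3date = c(NA, NA, "11/1/12", NA, "12/4/12"),
  V3a = c(NA, NA, "Yes", NA, "Yes"),
  V3b = c(NA, NA, "No", NA, "No"),
  V3c = c(NA, NA, "Yes", NA, "No"),
  row.names = c("001", "002", "003", "004", "005"),
  stringsAsFactors = FALSE
)

# is.normal() as defined in the answer.
is.normal <- function(x) {
  any(c(
    all(x == c("Yes", "Yes", "No")),
    all(x == c("Yes", "No", "Yes")),
    all(x == c("No", "Yes", "Yes")),
    all(x == c("Yes", "Yes", "Yes"))
  ))
}

# Assumption: the three Yes/No columns directly follow each "...date" column.
starts <- grep("date$", names(data1)) + 1  # 2 6 10 for this layout

ok <- vapply(starts,
             function(i) apply(data1[i:(i + 2)], 1, is.normal),
             logical(nrow(data1)))
ok
```

For this dataset `starts` equals `c(2, 6, 10)`, so `ok` is the same matrix as above; adding a fourth visit with the same naming scheme would be picked up without editing the indices.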

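Now find the first visit at which each person becomes "normal" (if any). By inspection, it is 2 for everyone except the last person, who remains abnormal. The `if` guards against `min` returning `Inf` when normality is never reached.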

date.ind <- apply(ok, 1,
  function(x) {
    y <- which(x)                    # visits where the person is normal (NAs are dropped)
    if (length(y)) min(y) else NA    # first such visit, or NA if never normal
  }
)

> date.ind
001 002 003 004 005 
  2   2   2   2  NA 

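Then we can extract the dates, knowing the "group" (visit number) from above and how to get from it to the actual date column where normality was reached: the date for visit k is in column 4k - 3, i.e. columns 1, 5, and 9.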

dates <- vapply(seq_along(date.ind),
  function(x) {
    if (is.na(date.ind[x])) {
      NA_character_                                      # never resolved
    } else {
      as.character(data1[x, date.ind[x] * 4 - 3])        # date column of visit date.ind[x]
    }
  },
  character(1)
)
> dates
[1] "6/18/12" "7/5/12" "9/4/12" "9/22/12" NA

Extracting the other information is similar, since the column indices can be computed in the same way as above.
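To make that concrete, here is a hedged sketch of how all four requested columns might be assembled. It is not from the original answer. It rebuilds `data1` with character columns so that it runs on its own, reuses `is.normal` from the answer, and introduces the hypothetical names `starts`, `date.cols`, `first.normal`, `last.visit`, `pattern`, and `result`. It also assumes every ID has at least one visit date.

```r
data1 <- data.frame(
  V1date = c("4/5/12", "1/22/12", "4/5/12", "8/18/12", "9/6/12"),
  V1a = c("No", "No", "Yes", "No", "Yes"),
  V1b = c("Yes", "No", "No", "No", "No"),
  V1c = c("No", "Yes", "No", "No", "No"),
  V2date = c("6/18/12", "7/5/12", "9/4/12", "9/22/12", NA),
  V2a = c("Yes", "Yes", "Yes", "Yes", NA),
  V2b = c("No", "No", "No", "No", NA),
  V2c = c("Yes", "Yes", "Yes", "Yes", NA),
  V3date = c(NA, NA, "11/1/12", NA, "12/4/12"),
  V3a = c(NA, NA, "Yes", NA, "Yes"),
  V3b = c(NA, NA, "No", NA, "No"),
  V3c = c(NA, NA, "Yes", NA, "No"),
  row.names = c("001", "002", "003", "004", "005"),
  stringsAsFactors = FALSE
)

is.normal <- function(x) {
  any(c(
    all(x == c("Yes", "Yes", "No")),
    all(x == c("Yes", "No", "Yes")),
    all(x == c("No", "Yes", "Yes")),
    all(x == c("Yes", "Yes", "Yes"))
  ))
}

starts    <- c(2, 6, 10)   # first Yes/No column of each visit
date.cols <- starts - 1    # date column of each visit
idx       <- seq_len(nrow(data1))

ok <- vapply(starts, function(i) apply(data1[i:(i + 2)], 1, is.normal),
             logical(nrow(data1)))

# First visit with a normal pattern, or NA if the ID never resolves.
first.normal <- unname(apply(ok, 1, function(x) {
  y <- which(x)
  if (length(y)) min(y) else NA_integer_
}))

# Last visit with a non-missing date, skipping over intervening NAs.
last.visit <- unname(apply(!is.na(data1[date.cols]), 1,
                           function(x) max(which(x))))

# Resolving pattern as lower-case initials, e.g. "yny".
pattern <- vapply(idx, function(r) {
  k <- first.normal[r]
  if (is.na(k)) return(NA_character_)
  vals <- vapply(data1[r, starts[k] + 0:2], as.character, character(1))
  paste(tolower(substr(vals, 1, 1)), collapse = "")
}, character(1))

# Matrix indexing picks one date per row; an NA index yields NA.
dates <- as.matrix(data1[date.cols])
result <- data.frame(
  LastVisDate = dates[cbind(idx, last.visit)],
  Resolved    = ifelse(is.na(first.normal), "No", "Yes"),
  Pattern     = pattern,
  Resdate     = dates[cbind(idx, first.normal)],
  row.names   = rownames(data1),
  stringsAsFactors = FALSE
)
result
```

`cbind(data1, result)` would then append the four columns on the right. For this `data1`, the sketch gives 6/18/12, 7/5/12, and 9/22/12 as `LastVisDate` for IDs 001, 002, and 004, because it takes the last non-missing visit date; those three values differ from the `LastVisDate` entries shown in the question's `data2`.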

Regarding "r - Difficulty manipulating longitudinal data row-wise in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13894686/
