
r - Identify consecutive sequences based on a given variable

Reposted. Author: 行者123. Updated: 2023-12-02 11:18:08

I'm really stuck on this. df1 has the following variables:

  • serial = a group of people
  • id1 = the person within the group (e.g. 12 (serial) 1 (id1) = group 12, person 1; 12 2 = group 12, person 2; etc.)
  • Day = the day on which recording first started

  • Each day consists of the same number of observations (e.g. 95):

    day1 (Monday)    = day11-day196
    day2 (Tuesday)   = day21-day296
    day3 (Wednesday) = day31-day396
    day4 (Thursday)  = day41-day496
    day5 (Friday)    = day51-day596
    day6 (Saturday)  = day61-day696
    day7 (Sunday)    = day71-day796
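The per-day columns `day1` to `day7` used in the rest of the question look like totals over each day's observation columns. A minimal sketch of that aggregation, assuming the column names really follow the pattern "day" + day index + slot index; the toy table `wide`, its values, and the use of 3 slots per day are made up for illustration:

```r
# Toy wide table: 2 people, 7 days, 3 slots per day (the real data has far more).
# Column names follow the question's pattern: "day" + day index + slot index.
slots <- 3
wide <- data.frame(serial = c(12, 123), id1 = c(1, 1))
for (d in 1:7) {
  for (s in seq_len(slots)) {
    # Hypothetical values: person 1 records in slot 1 every day,
    # person 2 records in every slot, but only on even-numbered days.
    wide[[paste0("day", d, s)]] <- c(as.numeric(s == 1), as.numeric(d %% 2 == 0))
  }
}

# Sum all slot columns that belong to each day into one total per day.
day_totals <- sapply(1:7, function(d) {
  cols <- grepl(paste0("^day", d, "[0-9]+$"), names(wide))
  rowSums(wide[cols])
})
colnames(day_totals) <- paste0("day", 1:7)

daily <- cbind(wide[c("serial", "id1")], day_totals)
daily
```

Because the day index is a single digit (1 to 7), the pattern `^day1[0-9]+$` picks up `day11`, `day12`, ... without colliding with another day's columns.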

df1 example:

    serial id1 Day       day1 day2 day3 day4 day5 day6 day7
    12     1   Monday    2    1    2    1    1    3    1
    123    1   Tuesday   0    3    0    3    3    0    3
    10     1   Wednesday 0    3    3    3    3    3    3

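I want to identify continuous recordings (no gap between daily records) and the total number of records.

The start day of a continuous recording is given by the "Day" variable. For example, serial 12 is a continuous recording: it starts on Monday and there is a record on every day of the week (at least one of the 95 variables per day). Over the whole week (7 x 95 variables) there are 11 records.

A non-continuous recording would be serial 123, because day 3 and day 6 are gap days: the recording starts on Tuesday, and Wednesday and Saturday have no records.

Finally, I want to record the length of the continuous recording.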
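To make the definition concrete, here is a small base-R sketch for a single person (serial 123 from the example). It is only an illustration of the gap check, not the method used in the answer; the variable names `recorded`, `runs`, and `gap_days` are hypothetical.

```r
wkdays <- c("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

# Daily totals for serial 123, who starts recording on Tuesday.
x <- c(day1 = 0, day2 = 3, day3 = 0, day4 = 3, day5 = 3, day6 = 0, day7 = 3)
start <- match("Tuesday", wkdays)

# Keep only the days from the start day onward and flag the ones with records.
recorded <- x[start:length(x)] != 0

# rle() compresses the flags into runs of identical values.
runs <- rle(unname(recorded))
runs$values   # TRUE FALSE TRUE FALSE TRUE
runs$lengths  # 1 1 2 1 1

# The recording is continuous only if every day from the start day has a record,
# i.e. the flags form one single run of TRUE.
is_continuous <- all(recorded)
gap_days <- names(recorded)[!recorded]  # "day3" "day6"
```

Days before the start day (here `day1`) are ignored, so a zero on Monday does not count as a gap for someone who starts on Tuesday.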

Example output:

    serial id1 Duration Occurance Days
    12     1   11       7         day1 day2 day3 day4 day5 day6 day7
    123    1   12       0         0
    10     1   18       5         day3 day4 day5 day6 day7

Sample data:

    df1 <- structure(list(serial = c(12, 123, 10), id1 = c(1, 1, 1),
        Day = structure(1:3, .Label = c("Monday", "Tuesday", "Wednesday"),
                        class = "factor"),
        day1 = c(2, 0, 0), day2 = c(1, 3, 3), day3 = c(2, 0, 3),
        day4 = c(1, 3, 3), day5 = c(1, 3, 3), day6 = c(3, 0, 3),
        day7 = c(1, 3, 3)),
        row.names = c(NA, 3L), class = "data.frame")

Similar post: R - identify consecutive sequences

Best answer

We can use rleid from data.table to get 'Occurance' right:

    library(data.table)

    wkdays <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")

    # For each row: x = the day1..day7 counts, y = the start day
    out1 <- do.call(rbind, Map(function(x, y) {
      i1 <- match(y, wkdays):length(x)   # positions from the start day to Sunday
      i2 <- x[i1] != 0                   # TRUE where the day has a record
      i3 <- all(i2)                      # TRUE if there is no gap
      grp1 <- rleid(i2)                  # run id for each stretch of equal values
      Days <- if (i3) tapply(names(x)[i1][i2], grp1[i2], FUN = paste, collapse = ' ') else ''
      Occurance <- if (i3) length(grp1[i2]) else 0
      data.frame(Occurance, Days)
    }, asplit(df1[-(1:3)], 1), df1$Day))

    out1$Duration <- rowSums(df1[startsWith(names(df1), 'day')])
    out1
    #   Occurance                               Days Duration
    # 1         7 day1 day2 day3 day4 day5 day6 day7       11
    # 2         0                                          12
    # 3         5           day3 day4 day5 day6 day7       18
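The result above does not carry the `serial` and `id1` columns that the question's example output asks for. As a rough sketch, the same logic can be written in base R alone and combined with the identifier columns. `summarise_row` and `day_cols` are hypothetical names, not part of the original answer, and non-continuous rows get an empty `Days` string (as in the answer) rather than the `0` shown in the question.

```r
df1 <- data.frame(
  serial = c(12, 123, 10), id1 = c(1, 1, 1),
  Day = c("Monday", "Tuesday", "Wednesday"),
  day1 = c(2, 0, 0), day2 = c(1, 3, 3), day3 = c(2, 0, 3), day4 = c(1, 3, 3),
  day5 = c(1, 3, 3), day6 = c(3, 0, 3), day7 = c(1, 3, 3)
)
wkdays <- c("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")
day_cols <- paste0("day", 1:7)

# Hypothetical helper: summarise one person's week given the start day.
summarise_row <- function(x, start_day) {
  idx <- match(start_day, wkdays):length(x)  # start day through Sunday
  ok <- all(x[idx] != 0)                     # TRUE only if there is no gap
  data.frame(
    Occurance = if (ok) length(idx) else 0L,
    Days = if (ok) paste(names(x)[idx], collapse = " ") else ""
  )
}

rows <- lapply(seq_len(nrow(df1)), function(i) {
  summarise_row(unlist(df1[i, day_cols]), as.character(df1$Day[i]))
})

# Put the identifiers first, then Duration, Occurance and Days.
result <- cbind(
  df1[c("serial", "id1")],
  Duration = rowSums(df1[day_cols]),
  do.call(rbind, rows)
)
result
```

Selecting the day columns by name (`day_cols`) avoids relying on column positions such as `-(1:3)`, which would silently break if another identifier column were added.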

Regarding "r - Identify consecutive sequences based on a given variable", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61187493/
