r - How to get a value from a column based on the second occurrence of a condition after the first occurrence of another condition in R?

Reposted. Author: 行者123. Updated: 2023-12-04 21:15:29

Data

Here is a sample data frame:

> dput(df)
structure(list(Vehicle.ID = c(21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L,
45L, 45L, 45L, 45L, 45L, 45L, 45L), gap.dist = c(36L, 37L, 38L,
39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L,
52L, 53L, 54L, 55L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L,
34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L), safept = c("no",
"no", "no", "no", "dx_safe+CC2", "no", "no", "no", "no", "dx_safe",
"no", "no", "no", "no", "no", "dx_safe+CC2", "no", "no", "dx_safe",
"no", "no", "no", "no", "no", "dx_safe+CC2", "no", "no", "no",
"no", "dx_safe", "no", "no", "no", "no", "no", "no", "no", "no",
"dx_safe", "no")), .Names = c("Vehicle.ID", "gap.dist", "safept"
), row.names = c(NA, -40L), class = "data.frame")

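Goal

I want to create 2 columns. The first column is safetylower, which should contain the value of gap.dist, by Vehicle.ID, at the first occurrence of "dx_safe" in the safept column. The second column is safetyupper, which should contain:
  • the value of gap.dist, by Vehicle.ID, at the first occurrence of "dx_safe+CC2" AFTER the first occurrence of "dx_safe" (the position found previously). This applies if there is any occurrence of "dx_safe+CC2" after the first occurrence of "dx_safe";
  • otherwise, the last value of gap.dist for the given Vehicle.ID.

So the desired output looks like the following: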

Desired output
    > dput(df)
    structure(list(Vehicle.ID = c(21L, 21L, 21L, 21L, 21L, 21L, 21L,
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
    45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L,
    45L, 45L, 45L, 45L, 45L, 45L, 45L), gap.dist = c(36L, 37L, 38L,
    39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L,
    52L, 53L, 54L, 55L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L,
    34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L), safept = c("no",
    "no", "no", "no", "dx_safe+CC2", "no", "no", "no", "no", "dx_safe",
    "no", "no", "no", "no", "no", "dx_safe+CC2", "no", "no", "dx_safe",
    "no", "no", "no", "no", "no", "dx_safe+CC2", "no", "no", "no",
    "no", "dx_safe", "no", "no", "no", "no", "no", "no", "no", "no",
    "dx_safe", "no"), safetylower = c(45, 45, 45, 45, 45, 45, 45,
    45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 34, 34, 34,
    34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34,
    34), safetyupper = c(51, 51, 51, 51, 51, 51, 51, 51, 51, 51,
    51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 44, 44, 44, 44, 44, 44,
    44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44)), .Names = c("Vehicle.ID",
    "gap.dist", "safept", "safetylower", "safetyupper"), row.names = c(NA,
    -40L), class = "data.frame")

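What I have tried

I could only create the first column, safetylower, by using match. The code I tried, which does not achieve the goal, is shown below. Please help.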
    library(plyr)
    df <- ddply(df, 'Vehicle.ID', transform,
    safetylower = gap.dist[match('dx_safe', safept)],
    safetyupper = gap.dist[match('dx_safe+CC2', safept)])
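
The attempt falls short because match() returns the position of the first match in the whole vector and has no notion of "after the first dx_safe". On the sample data it therefore picks gap.dist 40 for vehicle 21 and 29 for vehicle 45, instead of the desired 51 and 44. A minimal illustration, assuming a made-up toy vector `s` whose labels are ordered like those of vehicle 21:

```r
# Toy vector (hypothetical, for illustration only)
s <- c("no", "dx_safe+CC2", "no", "dx_safe", "no", "dx_safe+CC2", "no")

first_safe <- match("dx_safe", s)      # position of the first "dx_safe"
naive_cc2  <- match("dx_safe+CC2", s)  # first "dx_safe+CC2" anywhere in s

# Restrict the search to positions after the first "dx_safe"
cc2_pos   <- which(s == "dx_safe+CC2")
after_cc2 <- cc2_pos[cc2_pos > first_safe][1]  # NA if none follows

c(first_safe = first_safe, naive_cc2 = naive_cc2, after_cc2 = after_cc2)
```

Here naive_cc2 points at a row before first_safe, while after_cc2 points at the row the question actually asks for.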

Edit

What if there is more than one set of dx_safe and dx_safe+CC2? Consider the following data frame:
    df <- data.frame(Vehicle.ID=rep(c(5,6),each= 50), 
    gap.dist = rep(seq(from=10, to=59), 2),
    safept = rep(c(rep('no', 5), 'dx_safe+CC2', rep('no', 4), 'dx_safe', rep('no', 3), 'dx_safe+CC2', rep('no', 5), 'dx_safe', rep('no', 28), 'dx_safe+CC2'), 2))

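Building on the code provided in the two answers (both work perfectly), how can I consider only the longer set (the one with the most rows in between) and get the gap.dist values for safetylower and safetyupper (by Vehicle.ID)? The output should be: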
    > dput(df)
    structure(list(Vehicle.ID = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6,
    6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
    6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
    6, 6, 6, 6, 6), gap.dist = c(10L, 11L, 12L, 13L, 14L, 15L, 16L,
    17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L,
    30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L,
    43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L,
    56L, 57L, 58L, 59L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
    19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L,
    32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L,
    45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L,
    58L, 59L), safept = structure(c(3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L,
    3L, 3L, 1L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 2L,
    3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L), .Label = c("dx_safe",
    "dx_safe+CC2", "no"), class = "factor"), safetylower = c(30,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    30, 30, 30), safetyupper = c(59, 59, 59, 59, 59, 59, 59, 59,
    59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
    59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
    59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
    59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
    59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
    59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59)), .Names = c("Vehicle.ID",
    "gap.dist", "safept", "safetylower", "safetyupper"), row.names = c(NA,
    -100L), class = "data.frame")

Best answer

How about a divide-and-conquer approach using split()?

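    unsplit(lapply(split(df, df$Vehicle.ID), function(x) {
      # row index of the first "dx_safe" within this vehicle
      lower <- which(x$safept == "dx_safe")[1]
      # first "dx_safe+CC2" row that comes after `lower` (NA if there is none)
      upper <- Filter(function(i) i > lower, which(x$safept == "dx_safe+CC2"))[1]
      # otherwise fall back to the vehicle's last row
      if (is.na(upper)) {
        upper <- nrow(x)
      }
      cbind(x, safetylower = x$gap.dist[lower], safetyupper = x$gap.dist[upper])
    }), df$Vehicle.ID)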

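Here we basically create one data.frame per "Vehicle.ID"; then I use your definitions to find the row indices from which the two "gap.dist" values should be taken. Finally, I add those values back onto the data.frame and unsplit() the data to restore the original row order.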
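
The answer's code targets the original question and does not cover the follow-up in the edit. One possible way to extend the same split()/unsplit() pattern to the "longest set" case is sketched below. The helper name `longest_set` is hypothetical, and the rules it encodes are assumptions rather than anything stated in the answer: each "dx_safe" is paired with the first "dx_safe+CC2" after it (or with the last row if none follows), and ties go to the earliest pair.

```r
# Data frame from the edit
df <- data.frame(
  Vehicle.ID = rep(c(5, 6), each = 50),
  gap.dist   = rep(seq(from = 10, to = 59), 2),
  safept     = rep(c(rep("no", 5), "dx_safe+CC2", rep("no", 4), "dx_safe",
                     rep("no", 3), "dx_safe+CC2", rep("no", 5), "dx_safe",
                     rep("no", 28), "dx_safe+CC2"), 2)
)

# Hypothetical helper: for one vehicle's rows, keep the
# dx_safe / dx_safe+CC2 pair with the most rows in between.
longest_set <- function(x) {
  lower_idx <- which(x$safept == "dx_safe")
  cc2_idx   <- which(x$safept == "dx_safe+CC2")
  if (length(lower_idx) == 0) {
    # Assumption: without any "dx_safe", both bounds are left as NA
    return(cbind(x, safetylower = NA, safetyupper = NA))
  }
  # Pair every "dx_safe" with the first "dx_safe+CC2" after it;
  # assumption: fall back to the last row when none follows
  upper_idx <- vapply(lower_idx, function(i) {
    nxt <- cc2_idx[cc2_idx > i]
    if (length(nxt) > 0) nxt[1] else nrow(x)
  }, integer(1))
  # which.max() returns the first maximum, so ties go to the earliest pair
  best <- which.max(upper_idx - lower_idx)
  cbind(x,
        safetylower = x$gap.dist[lower_idx[best]],
        safetyupper = x$gap.dist[upper_idx[best]])
}

out <- unsplit(lapply(split(df, df$Vehicle.ID), longest_set), df$Vehicle.ID)
unique(out[, c("Vehicle.ID", "safetylower", "safetyupper")])
```

On the edit's data, each vehicle has two pairs (rows 11 to 15 and rows 21 to 50); the second is longer, so both vehicles get safetylower 30 and safetyupper 59, matching the output requested in the edit.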

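Regarding "r - How to get a value from a column based on the second occurrence of a condition after the first occurrence of another condition in R?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24555435/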
