Return the first value change after a condition row, per group

Reposted. Author: 行者123. Updated: 2023-12-02 01:26:16

I have the following data frame called df (dput below):

   group                date indicator value
1 A 2022-11-01 01:00:00 FALSE 2
2 A 2022-11-01 03:00:00 FALSE 1
3 A 2022-11-01 04:00:00 FALSE 2
4 A 2022-11-01 05:00:00 FALSE 1
5 A 2022-11-01 06:00:00 TRUE 1
6 A 2022-11-01 07:00:00 FALSE 1
7 A 2022-11-01 10:00:00 FALSE 2
8 A 2022-11-01 12:00:00 FALSE 1
9 B 2022-11-01 01:00:00 FALSE 1
10 B 2022-11-01 02:00:00 FALSE 2
11 B 2022-11-01 03:00:00 FALSE 1
12 B 2022-11-01 06:00:00 TRUE 1
13 B 2022-11-01 07:00:00 FALSE 1
14 B 2022-11-01 08:00:00 FALSE 1
15 B 2022-11-01 11:00:00 FALSE 2
16 B 2022-11-01 13:00:00 FALSE 2

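For each group, I want to find the first row after the indicator == TRUE row whose value differs from the value in that row. That means row 7 for group A and row 15 for group B, since each is the first row that comes after the condition row and has a value different from it. Here is the desired output, called df_desired: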

  group                date indicator value
1 A 2022-11-01 06:00:00 TRUE 1
2 A 2022-11-01 10:00:00 FALSE 2
3 B 2022-11-01 06:00:00 TRUE 1
4 B 2022-11-01 11:00:00 FALSE 2

So I was wondering if anyone knows how to find the desired rows in df_desired, using the condition row?


Here is the dput of df and df_desired:

df <- structure(list(group = c("A", "A", "A", "A", "A", "A", "A", "A", 
"B", "B", "B", "B", "B", "B", "B", "B"), date = structure(c(1667260800,
1667268000, 1667271600, 1667275200, 1667278800, 1667282400, 1667293200,
1667300400, 1667260800, 1667264400, 1667268000, 1667278800, 1667282400,
1667286000, 1667296800, 1667304000), class = c("POSIXct", "POSIXt"
), tzone = ""), indicator = c(FALSE, FALSE, FALSE, FALSE, TRUE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE,
FALSE, FALSE), value = c(2, 1, 2, 1, 1, 1, 2, 1, 1, 2, 1, 1,
1, 1, 2, 2)), row.names = c(NA, -16L), class = "data.frame")

df_desired <- structure(list(group = c("A", "A", "B", "B"), date = c("2022-11-01 06:00:00",
"2022-11-01 10:00:00", "2022-11-01 06:00:00", "2022-11-01 11:00:00"
), indicator = c(TRUE, FALSE, TRUE, FALSE), value = c(1, 2, 1,
2)), class = "data.frame", row.names = c(NA, -4L))

Best answer

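Here is one approach. It flags rows after the condition row where value rises by exactly 1 compared with the previous row, which matches the sample data: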

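library(dplyr)

df %>%
  group_by(group) %>%
  mutate(
    # change in value relative to the previous row (NA for the first row of each group)
    val_diff = value - lag(value),
    # position of the condition row; assumes exactly one indicator == TRUE per group
    new = row_number()[indicator],
    # rows after the condition row where value rose by exactly 1
    new1 = (val_diff == 1) & (row_number() > new)
  ) %>%
  # keep the condition row and the flagged rows, then drop the helper columns
  filter(indicator | new1) %>%
  select(-c(val_diff, new, new1))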

# A tibble: 4 × 4
# Groups: group [2]
group date indicator value
<chr> <dttm> <lgl> <dbl>
1 A 2022-11-01 08:00:00 TRUE 1
2 A 2022-11-01 12:00:00 FALSE 2
3 B 2022-11-01 08:00:00 TRUE 1
4 B 2022-11-01 13:00:00 FALSE 2
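
The code above relies on the value rising by exactly 1 and on exactly one TRUE row per group. If the goal is read more generally as "the first later row whose value differs from the condition row's value", a sketch along the following lines might work. It is not part of the original answer: the helper columns cond_row, cond_val, changed and first_change are names made up for illustration, only the first TRUE row in each group is treated as the condition row, and the dates are rebuilt in UTC (an assumption) so that the printout does not depend on the local time zone.

```r
library(dplyr)

# The question's data, with dates rebuilt from the dput's epoch seconds in UTC
# (an assumption made here so the example prints the same everywhere).
secs <- c(1667260800, 1667268000, 1667271600, 1667275200,
          1667278800, 1667282400, 1667293200, 1667300400,
          1667260800, 1667264400, 1667268000, 1667278800,
          1667282400, 1667286000, 1667296800, 1667304000)
df <- data.frame(
  group = rep(c("A", "B"), each = 8),
  date = as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"),
  indicator = seq_along(secs) %in% c(5, 12),
  value = c(2, 1, 2, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 2)
)

result <- df %>%
  group_by(group) %>%
  mutate(
    # position of the first TRUE row within the group (NA if there is none)
    cond_row = match(TRUE, indicator),
    # value on the condition row, repeated for every row of the group
    cond_val = value[cond_row],
    # later rows whose value differs from the condition row's value
    changed = row_number() > cond_row & value != cond_val,
    # only the first such row
    first_change = changed & cumsum(changed) == 1
  ) %>%
  # keep the condition row plus the first changed row; NA conditions are dropped
  filter(row_number() == cond_row | first_change) %>%
  ungroup() %>%
  select(group, date, indicator, value)

result
```

On the question's data this keeps rows 5 and 7 of group A and rows 12 and 15 of group B. A group with no TRUE row contributes no rows, because every filter condition is NA for it.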

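Data as printed from the OP's dput. The timestamps are two hours later than in the question because the dput stores date with tzone = "", so R prints it in the local time zone of whoever runs the code: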

  group                date indicator value
1 A 2022-11-01 03:00:00 FALSE 2
2 A 2022-11-01 05:00:00 FALSE 1
3 A 2022-11-01 06:00:00 FALSE 2
4 A 2022-11-01 07:00:00 FALSE 1
5 A 2022-11-01 08:00:00 TRUE 1
6 A 2022-11-01 09:00:00 FALSE 1
7 A 2022-11-01 12:00:00 FALSE 2
8 A 2022-11-01 14:00:00 FALSE 1
9 B 2022-11-01 03:00:00 FALSE 1
10 B 2022-11-01 04:00:00 FALSE 2
11 B 2022-11-01 05:00:00 FALSE 1
12 B 2022-11-01 08:00:00 TRUE 1
13 B 2022-11-01 09:00:00 FALSE 1
14 B 2022-11-01 10:00:00 FALSE 1
15 B 2022-11-01 13:00:00 FALSE 2
16 B 2022-11-01 15:00:00 FALSE 2
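
To make the printed times independent of the machine, one option is to set the tzone attribute explicitly. The following is a minimal sketch using two of the epoch-second values from the dput. The variable names secs, local_dates and utc_dates are made up for illustration, and UTC is an arbitrary choice.

```r
# Two of the epoch-second values from the question's dput
# (rows 5 and 7 of group A).
secs <- c(1667278800, 1667293200)

# tzone = "" means "print in the session's local time zone".
local_dates <- structure(secs, class = c("POSIXct", "POSIXt"), tzone = "")

# Changing the attribute changes only how the instants are displayed,
# not the stored number of seconds.
utc_dates <- local_dates
attr(utc_dates, "tzone") <- "UTC"

format(utc_dates, "%Y-%m-%d %H:%M:%S")
# "2022-11-01 05:00:00" "2022-11-01 09:00:00"

# The underlying instants are unchanged.
identical(as.numeric(utc_dates), secs)
```

The same assignment on df$date would make the table print the same on any machine, at the cost of no longer matching the local times shown in the question.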

For more on returning the first value change after a condition row per group, see the similar question on Stack Overflow: https://stackoverflow.com/questions/74543752/
