
r - pivot_longer with multiple columns in R

Reposted. Author: 行者123. Updated: 2023-12-05 03:16:50

Context

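I want to reshape df (wide format) into df_expected (long format), but my attempt failed.

I know I need to use pivot_longer, but I have to convert several columns at once.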

> df
# A tibble: 2 × 5
     id dis1_event dis1_event_time dis2_event dis2_event_time
  <dbl>      <dbl> <chr>                <dbl> <chr>
1     1          1 2022-01-01               0 2022-12-31
2     2          0 2018-01-01               1 2018-03-01
> df_expected
# A tibble: 4 × 4
     id disease event time
  <dbl> <chr>   <dbl> <chr>
1     1 dis1        1 2022-01-01
2     1 dis2        0 2022-12-31
3     2 dis1        0 2018-01-01
4     2 dis2        1 2018-03-01

Question

How can I use pivot_longer on multiple columns in R?

Reproducible code

library(tidyverse)

# Wide input: one event column and one event-time column per disease
df = tribble(
  ~id, ~dis1_event, ~dis1_event_time, ~dis2_event, ~dis2_event_time,
  1, 1, '2022-01-01', 0, '2022-12-31',
  2, 0, '2018-01-01', 1, '2018-03-01',
)

# Desired long output: one row per id and disease
df_expected =
  tribble(
    ~id, ~disease, ~event, ~time,
    1, 'dis1', 1, '2022-01-01',
    1, 'dis2', 0, '2022-12-31',
    2, 'dis1', 0, '2018-01-01',
    2, 'dis2', 1, '2018-03-01',
  )

# my failed solution
df %>%
  pivot_longer(cols = ends_with('event'), names_to = 'disease', values_to = 'event') %>%
  pivot_longer(cols = ends_with('time'), names_to = 'disease', values_to = 'time')

Best answer

Here is one possible solution:

library(tidyverse)

df = tribble(
  ~id, ~dis1_event, ~dis1_event_time, ~dis2_event, ~dis2_event_time,
  1, 1, '2022-01-01', 0, '2022-12-31',
  2, 0, '2018-01-01', 1, '2018-03-01',
)

df_expected =
  tribble(
    ~id, ~disease, ~event, ~time,
    1, 'dis1', 1, '2022-01-01',
    1, 'dis2', 0, '2022-12-31',
    2, 'dis1', 0, '2018-01-01',
    2, 'dis2', 1, '2018-03-01',
  )

df
#> # A tibble: 2 × 5
#>      id dis1_event dis1_event_time dis2_event dis2_event_time
#>   <dbl>      <dbl> <chr>                <dbl> <chr>
#> 1     1          1 2022-01-01               0 2022-12-31
#> 2     2          0 2018-01-01               1 2018-03-01

df %>%
  pivot_longer(-id,
               names_pattern = "([a-z]+\\d)_([_a-z]+)",
               names_to = c("disease", ".value"))
#> # A tibble: 4 × 4
#>      id disease event event_time
#>   <dbl> <chr>   <dbl> <chr>
#> 1     1 dis1        1 2022-01-01
#> 2     1 dis2        0 2022-12-31
#> 3     2 dis1        0 2018-01-01
#> 4     2 dis2        1 2018-03-01

df_expected
#> # A tibble: 4 × 4
#>      id disease event time
#>   <dbl> <chr>   <dbl> <chr>
#> 1     1 dis1        1 2022-01-01
#> 2     1 dis2        0 2022-12-31
#> 3     2 dis1        0 2018-01-01
#> 4     2 dis2        1 2018-03-01

Created on 2022-11-15 by the reprex package (v2.0.1)
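To see why the pattern in the answer works, it can help to look at what its two capture groups pull out of each column name. The sketch below is only an illustration: it uses stringr::str_match rather than whatever pivot_longer does internally, and the cols vector is simply the four non-id column names typed out by hand. The first group feeds the "disease" column; the second group is matched to ".value", so each distinct string it captures becomes an output column name.

```r
library(stringr)

# The four columns being pivoted (everything except id)
cols <- c("dis1_event", "dis1_event_time", "dis2_event", "dis2_event_time")

# Same regular expression as in the answer:
#   group 1 = letters followed by one digit  -> goes to "disease"
#   group 2 = the rest after the underscore  -> goes to ".value"
parts <- str_match(cols, "([a-z]+\\d)_([_a-z]+)")

parts[, 2]  # "dis1" "dis1" "dis2" "dis2"
parts[, 3]  # "event" "event_time" "event" "event_time"
```

Because group 2 only ever takes the values "event" and "event_time", the pivoted result has exactly those two value columns.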

The "time" column ends up named "event_time", but you can fix that with ... %>% rename("time" = "event_time")
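Putting the two steps together, a minimal sketch of the full pipeline might look like this. The names result and result_alt are made up for the illustration. The second variant is not from the answer: it assumes you are free to shorten the "_event_time" suffix to "_time" before pivoting, after which a plain names_sep = "_" is enough and no rename is needed afterwards.

```r
library(tidyr)
library(dplyr)
library(tibble)

df <- tribble(
  ~id, ~dis1_event, ~dis1_event_time, ~dis2_event, ~dis2_event_time,
  1, 1, "2022-01-01", 0, "2022-12-31",
  2, 0, "2018-01-01", 1, "2018-03-01"
)

df_expected <- tribble(
  ~id, ~disease, ~event, ~time,
  1, "dis1", 1, "2022-01-01",
  1, "dis2", 0, "2022-12-31",
  2, "dis1", 0, "2018-01-01",
  2, "dis2", 1, "2018-03-01"
)

# Variant 1: the answer's pivot, followed by the suggested rename
result <- df %>%
  pivot_longer(-id,
               names_pattern = "([a-z]+\\d)_([_a-z]+)",
               names_to = c("disease", ".value")) %>%
  rename(time = event_time)

# Variant 2 (an alternative, not from the answer): shorten the suffix first,
# then split each column name on the single remaining underscore
result_alt <- df %>%
  rename_with(~ sub("_event_time$", "_time", .x)) %>%
  pivot_longer(-id,
               names_sep = "_",
               names_to = c("disease", ".value"))

# Compare both results against the expected tibble
all.equal(as.data.frame(result), as.data.frame(df_expected))
all.equal(as.data.frame(result_alt), as.data.frame(df_expected))
```

Either variant should give one row per id and disease with the columns id, disease, event and time.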

Regarding "r - pivot_longer with multiple columns in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/74440042/
