r - How to pivot_longer across multiple columns with unequal numbers of comma-separated values

Reposted. Author: 行者123. Updated: 2023-12-04 00:49:00

I have some messy-looking data in which several columns each hold multiple values, some of them comma-separated:

df <- data.frame(
  Line = 1:2,
  Utterance = c("hi there", "how're ya"),
  A_aoi = c("C*B*C", "*"),
  A_aoi_dur = c("100,25,30,50,144", "200"),
  B_aoi = c("*A", "*A*A*C"),
  B_aoi_dur = c("777,876", "50,22,33,100,150,999")
)

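What I want to do is pivot_longer so that every individual value gets its own row. I can accomplish this, but the way I do it is anything but elegant: it involves several intermediate steps and temporary data frames, which makes the code verbose and heavy: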

library(dplyr)
library(tidyr)

# first temporary `df`:
df1 <- df %>%
  select(-ends_with("dur")) %>%
  pivot_longer(cols = ends_with("aoi"),
               names_to = "Speaker") %>%
  separate_rows(value, sep = "(?!^|$)") %>%
  mutate(Speaker = sub("^(.).*", "\\1", Speaker)) %>%
  rename(AOI = value)

# second temporary `df`:
df2 <- df %>%
  select(-ends_with("aoi")) %>%
  pivot_longer(cols = ends_with("dur")) %>%
  separate_rows(value, sep = ",") %>%
  rename(Dur = value)

# final `df` (aka, the **expected outcome**):
df3 <- cbind(df1, df2[,4])

df3
   Line Utterance Speaker AOI Dur
1     1  hi there       A   C 100
2     1  hi there       A   *  25
3     1  hi there       A   B  30
4     1  hi there       A   *  50
5     1  hi there       A   C 144
6     1  hi there       B   * 777
7     1  hi there       B   A 876
8     2 how're ya       A   * 200
9     2 how're ya       B   *  50
10    2 how're ya       B   A  22
11    2 how're ya       B   *  33
12    2 how're ya       B   A 100
13    2 how're ya       B   * 150
14    2 how're ya       B   C 999

How can this transformation be achieved more concisely?

Best answer

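Here is one option using the tidyverse:

  1. Rename (rename_with) the columns that end with '_aoi' by appending '_AOI' to their names
  2. Reshape from 'wide' to 'long' with pivot_longer
  3. Insert a , between every pair of characters in 'AOI' with str_replace_all, so that 'AOI' and 'dur' share a common delimiter
  4. Finally, split both columns on that delimiter with separate_rows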
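
Step 3 is the least obvious one, so here is a minimal sketch of it in isolation. The vector `aoi` is a made-up sample that reuses values from the question. The pattern `(?<=.)(?=.)` matches the empty position that has a character on both sides, so a comma is inserted between characters but never at the start or end of a string.

```r
library(stringr)

# Hypothetical sample taken from the A_aoi / B_aoi columns of the question
aoi <- c("C*B*C", "*", "*A")

# Insert "," at every zero-width position flanked by a character on each side
out <- str_replace_all(aoi, "(?<=.)(?=.)", ",")
print(out)
```

A single-character string such as "*" has no such position and is left unchanged, which is why rows holding one value survive the later split intact.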
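
# Full pipeline implementing the four steps listed above
library(dplyr)
library(tidyr)
library(stringr)
df %>%
  # step 1: A_aoi -> A_aoi_AOI, so every name ends in "_AOI" or "_dur"
  rename_with(~ str_c(., "_AOI"), ends_with("_aoi")) %>%
  # step 2: the last name piece (.value) becomes the column, the rest becomes Speaker
  pivot_longer(cols = contains("_"),
               names_to = c("Speaker", ".value"),
               names_pattern = "^(.*)_([^_]+$)") %>%
  # step 3: put a comma between every pair of characters in AOI
  mutate(AOI = str_replace_all(AOI, "(?<=.)(?=.)", ",")) %>%
  # step 4: split AOI and dur in parallel, one value per row
  separate_rows(c(AOI, dur), sep = ",", convert = TRUE)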

Output:

# A tibble: 14 x 5
    Line Utterance Speaker AOI     dur
   <int> <chr>     <chr>   <chr> <int>
 1     1 hi there  A_aoi   C       100
 2     1 hi there  A_aoi   *        25
 3     1 hi there  A_aoi   B        30
 4     1 hi there  A_aoi   *        50
 5     1 hi there  A_aoi   C       144
 6     1 hi there  B_aoi   *       777
 7     1 hi there  B_aoi   A       876
 8     2 how're ya A_aoi   *       200
 9     2 how're ya B_aoi   *        50
10     2 how're ya B_aoi   A        22
11     2 how're ya B_aoi   *        33
12     2 how're ya B_aoi   A       100
13     2 how're ya B_aoi   *       150
14     2 how're ya B_aoi   C       999
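
This output differs from the expected outcome in the question in two small ways: Speaker reads "A_aoi"/"B_aoi" rather than "A"/"B", and the duration column is named dur rather than Dur. The following is a sketch of one possible way to close that gap, not part of the original answer: the `str_remove()` call and the final `rename()` are additions made here, and `res` is just an illustrative name.

```r
library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(
  Line = 1:2,
  Utterance = c("hi there", "how're ya"),
  A_aoi = c("C*B*C", "*"),
  A_aoi_dur = c("100,25,30,50,144", "200"),
  B_aoi = c("*A", "*A*A*C"),
  B_aoi_dur = c("777,876", "50,22,33,100,150,999")
)

res <- df %>%
  rename_with(~ str_c(.x, "_AOI"), ends_with("_aoi")) %>%
  pivot_longer(cols = contains("_"),
               names_to = c("Speaker", ".value"),
               names_pattern = "^(.*)_([^_]+$)") %>%
  mutate(
    AOI = str_replace_all(AOI, "(?<=.)(?=.)", ","),
    # Addition (not in the original answer): drop the "_aoi" suffix
    Speaker = str_remove(Speaker, "_aoi$")
  ) %>%
  separate_rows(AOI, dur, sep = ",", convert = TRUE) %>%
  # Addition (not in the original answer): match the question's column name
  rename(Dur = dur)

print(res)
```

Because `convert = TRUE` is kept, Dur comes out as an integer column, whereas the question's own `cbind()` approach leaves it as character.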

Regarding "r - How to pivot_longer across multiple columns with unequal numbers of comma-separated values", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68120708/
