
r - How do I merge rows in one column to match the non-empty rows in another column?

Reposted. Author: 行者123. Updated: 2023-12-02 03:03:04

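I have a .csv file with two columns. The first is an ID and the second is a text field. However, the text in the text field is split into sentences, and each sentence after the first spills onto its own row with no ID, so the file looks like this: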

ID TEXT
TXT_1 This is the first sentence
NA This is the second sentence
NA This is the third sentence
TXT_2 This is the first sentence of the second text
NA This is the second sentence of the second text
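To experiment with the data shown above, it can be rebuilt by hand. This is only a minimal sketch: the object name `df1` is chosen to match the answer further down, and it assumes the missing IDs are read in as real `NA` values rather than the literal string "NA".

```r
# Rebuild the example data by hand; df1 is an assumed name
df1 <- data.frame(
  ID = c("TXT_1", NA, NA, "TXT_2", NA),
  TEXT = c(
    "This is the first sentence",
    "This is the second sentence",
    "This is the third sentence",
    "This is the first sentence of the second text",
    "This is the second sentence of the second text"
  ),
  stringsAsFactors = FALSE
)

# Three of the five rows have a missing ID
print(sum(is.na(df1$ID)))
```

When reading the real file, `read.csv()` with a suitable `na.strings` argument should produce the same shape, provided the empty ID cells are listed there.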

What I want to do is merge the text fields so that the result looks like this:

ID TEXT
TXT_1 This is the first sentence This is the second sentence This is the third sentence
TXT_2 This is the first sentence of the second text This is the second sentence of the second text

Is there a simple solution for this in R?

Best answer

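We create a grouping variable from the non-NA elements of 'ID' (each non-NA value starts a new group), then paste the 'TEXT' values of each group together: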

library(dplyr)

df1 %>%
  # start a new group at every row whose ID is not NA
  group_by(Grp = cumsum(!is.na(ID))) %>%
  # keep the group's ID and join its sentences with single spaces
  summarise(ID = ID[!is.na(ID)], TEXT = paste(TEXT, collapse = ' ')) %>%
  ungroup() %>%
  select(-Grp)
# A tibble: 2 x 2
#   ID    TEXT
#   <chr> <chr>
# 1 TXT_1 This is the first sentence This is the second sentence This is the third sentence
# 2 TXT_2 This is the first sentence of the second text This is the second sentence of the second text
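The grouping step is easier to follow in isolation. The sketch below uses only base R and a standalone `ID` vector copied from the question; the names `is_start` and `grp` are illustrative and not part of the original answer.

```r
# ID column from the question, as a plain vector
ID <- c("TXT_1", NA, NA, "TXT_2", NA)

# TRUE wherever a new text begins (a non-NA ID)
is_start <- !is.na(ID)

# The running count of starts gives each row the index of its text
grp <- cumsum(is_start)

print(is_start)
# [1]  TRUE FALSE FALSE  TRUE FALSE
print(grp)
# [1] 1 1 1 2 2
```

Every row carries the number of the most recent non-NA ID, which is why grouping on this value collects each text's sentences together.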

Or, as suggested by @Jaap:

df1 %>%
  group_by(ID = zoo::na.locf(ID)) %>%
  summarise(TEXT = paste(TEXT, collapse = ' '))
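If neither dplyr nor zoo is available, the same idea can be sketched in base R. This is not from the original answers: it swaps `zoo::na.locf()` for an indexing-based forward fill and `summarise()` for `tapply()`, and it assumes the first row always has a non-NA ID. The names `filled`, `texts`, and `merged` are illustrative.

```r
# Example data from the question (df1 is an assumed name)
df1 <- data.frame(
  ID = c("TXT_1", NA, NA, "TXT_2", NA),
  TEXT = c(
    "This is the first sentence",
    "This is the second sentence",
    "This is the third sentence",
    "This is the first sentence of the second text",
    "This is the second sentence of the second text"
  ),
  stringsAsFactors = FALSE
)

# Forward-fill the IDs: look up the most recent non-NA ID for each row.
# Assumes the first row's ID is not NA.
filled <- df1$ID[!is.na(df1$ID)][cumsum(!is.na(df1$ID))]

# Paste the sentences of each ID together, keeping first-appearance order
texts <- tapply(
  df1$TEXT,
  factor(filled, levels = unique(filled)),
  paste,
  collapse = " "
)

merged <- data.frame(
  ID = names(texts),
  TEXT = as.vector(texts),
  stringsAsFactors = FALSE
)
print(merged)
```

Setting the factor levels to `unique(filled)` keeps the IDs in the order they appear in the file rather than sorting them alphabetically.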

Regarding "r - How do I merge rows in one column to match the non-empty rows in another column?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44695235/
