
r - Concatenate specific strings in a column

Reposted. Author: 行者123. Updated: 2023-12-01 11:11:23

I have a data frame like this:

df <- data.frame("region" = c("Spain", "Barcelona", "Madrid",
                              "France", "Paris", "Lyon",
                              "Belgium", "Bruges", "Brussels"),
                 "2010" = 1:9, "2011" = c(NA, 1, 2, NA, 3, 4, NA, 5, 6))

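I want to concatenate the country name with each city name. Every country row has NA in the 2011 column, and each country's cities follow directly after the country row.

The data frame I want looks like this: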

desired_df <- data.frame("region" = c("Spain_Spain", "Spain_Barcelona", "Spain_Madrid",
                                      "France_France", "France_Paris", "France_Lyon",
                                      "Belgium_Belgium", "Belgium_Bruges", "Belgium_Brussels"),
                         "2010" = 1:9, "2011" = c(NA, 1, 2, NA, 3, 4, NA, 5, 6))

It is fine if the country_country rows are missing. Any help would be greatly appreciated.

Best answer

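We can create a grouping variable from the positions where a country name occurs, then paste the first element of 'region' in each group onto every element of 'region' to update the column.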

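library(dplyr)
library(stringr)

df %>%
  # start a new group at every country row
  group_by(grp = cumsum(region %in% c("Spain", "France", "Belgium"))) %>%
  # prefix each region with the first region of its group (the country)
  mutate(region = str_c(first(region), region, sep = "_")) %>%
  ungroup() %>%
  select(-grp)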
# A tibble: 9 x 3
#  region           X2010 X2011
#  <chr>            <int> <dbl>
#1 Spain_Spain          1    NA
#2 Spain_Barcelona      2     1
#3 Spain_Madrid         3     2
#4 France_France        4    NA
#5 France_Paris         5     3
#6 France_Lyon          6     4
#7 Belgium_Belgium      7    NA
#8 Belgium_Bruges       8     5
#9 Belgium_Brussels     9     6
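
To make the grouping step easier to follow, here is a minimal base R sketch of the same idea, without dplyr. The names `countries`, `is_country`, `grp`, `prefix`, and `result` are introduced only for this illustration and are not part of the original answer.

```r
# Rebuild the example data; data.frame() renames "2010"/"2011" to X2010/X2011
df <- data.frame("region" = c("Spain", "Barcelona", "Madrid",
                              "France", "Paris", "Lyon",
                              "Belgium", "Bruges", "Brussels"),
                 "2010" = 1:9, "2011" = c(NA, 1, 2, NA, 3, 4, NA, 5, 6),
                 stringsAsFactors = FALSE)

# Assumption: the country names are known in advance
countries <- c("Spain", "France", "Belgium")

# TRUE on each country row; the running sum labels each block 1, 2, 3
is_country <- df$region %in% countries
grp <- cumsum(is_country)

# Country rows in order; indexing by grp repeats each country over its block
prefix <- df$region[is_country][grp]

result <- df
result$region <- paste(prefix, df$region, sep = "_")
```

Because `cumsum()` only increases on a country row, every city inherits the group number of the country above it, which is why the approach relies on cities directly following their country.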

Or, as @akash87 mentioned, if the grouping should instead be based on the NA values in 'X2011':

df %>%
  # start a new group at every row where X2011 is NA (the country rows)
  group_by(grp = cumsum(is.na(X2011))) %>%
  mutate(region = str_c(first(region), region, sep = "_")) %>%
  ungroup() %>%
  select(-grp)
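
The question notes that the country_country rows are optional. One possible way to drop them, sketched here as an assumption rather than as part of the original answer, is to filter out the NA rows after the prefix has been built. The name `cities_only` is hypothetical, and the sketch assumes that NA in X2011 occurs only on country rows; a city with a missing 2011 value would wrongly start a new group.

```r
library(dplyr)

# Rebuild the example data; data.frame() renames "2010"/"2011" to X2010/X2011
df <- data.frame("region" = c("Spain", "Barcelona", "Madrid",
                              "France", "Paris", "Lyon",
                              "Belgium", "Bruges", "Brussels"),
                 "2010" = 1:9, "2011" = c(NA, 1, 2, NA, 3, 4, NA, 5, 6),
                 stringsAsFactors = FALSE)

cities_only <- df %>%
  # each NA in X2011 marks a country row and opens a new group
  group_by(grp = cumsum(is.na(X2011))) %>%
  # build the prefix while the country row is still present in the group
  mutate(region = paste(first(region), region, sep = "_")) %>%
  # then drop the country rows themselves
  filter(!is.na(X2011)) %>%
  ungroup() %>%
  select(-grp)
```

The filter has to come after the `mutate()` call, since removing the country rows first would leave `first(region)` pointing at a city.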

Regarding "r - Concatenate specific strings in a column", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/59901830/
