
r - Splitting a data frame column on a comma

Reposted. Author: 行者123. Updated: 2023-12-01 16:22:49

I have a data frame named "final_proj_data" with the following structure:

     ID County                  Population  Year
  <dbl> <chr>                        <dbl> <dbl>
   1003 Baldwin County, Alabama     169162  2006
   1015 Calhoun County, Alabama     112903  2006
   1043 Cullman County, Alabama      80187  2006
   1049 DeKalb County, Alabama       68014  2006

I am trying to split the "County" column into two separate columns, "County" and "State", and drop the comma.

I have tried several permutations of the separate() function, but I keep getting this error:

Error: var must evaluate to a single number or a column name, not a character vector

Among other things, I have tried:

final_proj_data %>%
  separate(final_proj_data$County, c("State", "County"), sep = ",", remove = TRUE)

final_proj_data %>%
  separate(data = final_proj_data, col = County,
           into = c("State", "County"), sep = ",")

I'm not sure what I'm doing wrong, or why "col =" keeps throwing this error. Any help would be greatly appreciated!

Best answer

Using dplyr and base R:

library(dplyr)
final_proj_data %>%
  mutate(State = unlist(lapply(strsplit(County, ", "), function(x) x[2])),
         County = gsub(",.*", "", County))

    ID         County Population Year   State
1 1003 Baldwin County     169162 2006 Alabama
2 1015 Calhoun County     112903 2006 Alabama
3 1043 Cullman County      80187 2006 Alabama
4 1049  DeKalb County      68014 2006 Alabama
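
For readers who would rather not load any packages, the same split can be sketched in base R alone. This is a minimal illustration, not part of the original answer: it rebuilds the four sample rows from the question by hand and assumes every value in County contains exactly one comma.

```r
# Rebuild the four sample rows shown in the question.
final_proj_data <- data.frame(
  ID = c(1003, 1015, 1043, 1049),
  County = c("Baldwin County, Alabama", "Calhoun County, Alabama",
             "Cullman County, Alabama", "DeKalb County, Alabama"),
  Population = c(169162, 112903, 80187, 68014),
  Year = 2006,
  stringsAsFactors = FALSE
)

# State: drop everything up to the first comma plus any spaces after it.
# This must run before County is overwritten below.
final_proj_data$State <- sub("^[^,]*,\\s*", "", final_proj_data$County)

# County: drop the comma and everything after it.
final_proj_data$County <- sub(",.*$", "", final_proj_data$County)

print(final_proj_data)
```

The order of the two assignments matters: State is derived from the original County text, so County is trimmed last.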

Original answer:

Using dplyr and tidyr (I just saw that @Ronak Shah commented the same thing above):

library(dplyr)
library(tidyr)
final_proj_data %>%
  # Split on comma plus space so State has no leading blank.
  separate(County, c("County", "State"), sep = ", ")

    ID         County   State Population Year
1 1003 Baldwin County Alabama     169162 2006
2 1015 Calhoun County Alabama     112903 2006
3 1043 Cullman County Alabama      80187 2006
4 1049  DeKalb County Alabama      68014 2006
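
The error in the question is consistent with the first attempt passing final_proj_data$County, which is a character vector, where separate() expects a column name. The sketch below is a self-contained illustration rather than the answerer's code: it rebuilds the sample rows by hand and calls separate() without the pipe, so the data frame is supplied exactly once and col is the bare name County. The variable name result is introduced here only for the example.

```r
library(tidyr)

# Rebuild the four sample rows shown in the question.
final_proj_data <- data.frame(
  ID = c(1003, 1015, 1043, 1049),
  County = c("Baldwin County, Alabama", "Calhoun County, Alabama",
             "Cullman County, Alabama", "DeKalb County, Alabama"),
  Population = c(169162, 112903, 80187, 68014),
  Year = 2006,
  stringsAsFactors = FALSE
)

# The data frame is the first argument; col is a bare column name,
# not final_proj_data$County. The order of `into` follows the text:
# the part before the comma is the county, the part after is the state.
result <- separate(final_proj_data, col = County,
                   into = c("County", "State"), sep = ", ")

print(result)
```

When piping with %>%, the left-hand side already fills the data argument, so repeating data = final_proj_data inside the call supplies the data frame a second time.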

For "r - Splitting a data frame column on a comma", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55886633/
