
Replacing specific characters in a variable of an R data frame

Reposted. Author: 行者123. Last updated: 2023-12-04 04:55:13

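I want to replace every "," , "-" , ")" , "(" and " " (space) with "." in the variable DMA.NAME of the example data frame. I consulted three posts and tried their approaches, but all of them failed: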

Replacing column values in data frame, not included in list

R replace all particular values in a data frame

Replace characters from a column of a data frame R

Approach 1

> shouldbecomeperiod <- c$DMA.NAME %in% c("-", ",", " ", "(", ")")
c$DMA.NAME[shouldbecomeperiod] <- "."

Approach 2
> removetext <- c("-", ",", " ", "(", ")")
c$DMA.NAME <- gsub(removetext, ".", c$DMA.NAME)
c$DMA.NAME <- gsub(removetext, ".", c$DMA.NAME, fixed = TRUE)

Warning message:
In gsub(removetext, ".", c$DMA.NAME) :
argument 'pattern' has length > 1 and only the first element will be used

Approach 3
> c[c == c(" ", ",", "(", ")", "-")] <- "."

Example data frame
> df
      DMA.CODE                  DATE                   DMA.NAME count
111         22 8/14/2014 12:00:00 AM               Columbus, OH     1
112         23 7/15/2014 12:00:00 AM Orlando-Daytona Bch-Melbrn     1
79          18 7/30/2014 12:00:00 AM        Boston (Manchester)     1
99          22 8/20/2014 12:00:00 AM               Columbus, OH     1
112.1       23 7/15/2014 12:00:00 AM Orlando-Daytona Bch-Melbrn     1
208         27 7/31/2014 12:00:00 AM       Minneapolis-St. Paul     1

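I understand what goes wrong: gsub accepts a single pattern and uses only its first element. The other two approaches test whether the whole value equals one of the characters, rather than searching for those characters inside each value.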
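The diagnosis above can be checked on a small vector. This is a minimal sketch; `x` and `removetext` are hypothetical stand-ins for the DMA.NAME column and the character list used in the attempts.

```r
# x is a hypothetical stand-in for the DMA.NAME column
x <- c("Columbus, OH", "Boston (Manchester)")
removetext <- c("-", ",", " ", "(", ")")

# %in% compares each whole value against the set, so a city name
# never equals a single character and nothing is selected
whole_match <- x %in% removetext

# gsub() is not vectorised over 'pattern'; passing removetext makes it
# fall back to the first element, which is written out explicitly here.
# Neither value contains "-", so x comes back unchanged.
first_only <- gsub(removetext[1], ".", x, fixed = TRUE)
```

Both results leave the commas, spaces and parentheses in place, which matches the behaviour described in the question.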

Best answer

You can use the special character classes [:punct:] and [:space:] inside a bracket expression ([...]), like this:

df <- data.frame(
  DMA.NAME = c(
    "Columbus, OH",
    "Orlando-Daytona Bch-Melbrn",
    "Boston (Manchester)",
    "Columbus, OH",
    "Orlando-Daytona Bch-Melbrn",
    "Minneapolis-St. Paul"),
  stringsAsFactors = FALSE)
##
> gsub("[[:punct:][:space:]]+","\\.",df$DMA.NAME)
[1] "Columbus.OH" "Orlando.Daytona.Bch.Melbrn" "Boston.Manchester." "Columbus.OH"
[5] "Orlando.Daytona.Bch.Melbrn" "Minneapolis.St.Paul"
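The pattern above matches every punctuation character and, because of the `+`, collapses a run such as ", " into a single period. If only the five characters named in the question should be replaced, one for one, a narrower bracket expression or `chartr()` may fit better. This is a minimal sketch under that assumption; `x` is a hypothetical stand-in for `df$DMA.NAME`.

```r
# x is a hypothetical stand-in for df$DMA.NAME
x <- c("Columbus, OH", "Boston (Manchester)", "Minneapolis-St. Paul")

# Bracket expression listing only the target characters. The hyphen is
# placed first so it is read literally rather than as a range.
one_for_one <- gsub("[-, ()]", ".", x)

# chartr() maps each character in 'old' to the character at the same
# position in 'new', so both strings have five characters here
translated <- chartr("-, ()", ".....", x)
```

With either call, consecutive target characters each become their own period, so "Columbus, OH" turns into "Columbus..OH" rather than "Columbus.OH".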

A similar question about replacing specific characters in a variable of an R data frame can be found on Stack Overflow: https://stackoverflow.com/questions/26470307/
