r - How do I remove dollar signs ($) from a data frame in R?

Reposted. Author: 行者123. Updated: 2023-12-05 08:34:29

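I am new to R and am struggling with what seems like an extremely simple problem.

I imported a csv file into R with read.csv and am trying to remove the dollar signs ($) before tidying the data and analysing it further (the dollar signs wreak havoc on the plots).

I have been trying to strip the $ from the data frame with dplyr and gsub, without success, and would really appreciate some advice on how to do it.

My data frame looks like this: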

> str(data)
'data.frame': 50 obs. of 17 variables:
$ Year : int 1 2 3 4 5 6 7 8 9 10 ...
$ Prog.Cost : Factor w/ 2 levels "-$3,333","$0": 1 2 2 2 2 2 2 2 2 2 ...
$ Total.Benefits : Factor w/ 44 levels "$2,155","$2,418",..: 25 5 7 11 12 10 9 14 13 8 ...
$ Net.Cash.Flow : Factor w/ 45 levels "-$2,825","$2,155",..: 1 6 8 12 13 11 10 15 14 9 ...
$ Participant : Factor w/ 46 levels "$0","$109","$123",..: 1 1 1 45 46 2 3 4 5 6 ...
$ Taxpayer : Factor w/ 48 levels "$113","$114",..: 19 32 35 37 38 40 41 45 48 47 ...
$ Others : Factor w/ 47 levels "-$9","$1,026",..: 12 25 26 24 23 11 9 10 8 7 ...
$ Indirect : Factor w/ 42 levels "-$1,626","-$2",..: 1 6 10 18 22 24 28 33 36 35 ...
$ Crime : Factor w/ 35 levels "$0","$1","$10",..: 6 11 13 19 21 23 28 31 33 32 ...
$ Child.Welfare : Factor w/ 1 level "$0": 1 1 1 1 1 1 1 1 1 1 ...
$ Education : Factor w/ 1 level "$0": 1 1 1 1 1 1 1 1 1 1 ...
$ Health.Care : Factor w/ 38 levels "-$10","-$11",..: 7 7 7 7 2 8 12 36 30 9 ...
$ Welfare : Factor w/ 1 level "$0": 1 1 1 1 1 1 1 1 1 1 ...
$ Earnings : Factor w/ 41 levels "$0","$101","$104",..: 1 1 1 22 23 24 25 26 27 28 ...
$ State.Benefits : Factor w/ 37 levels "$102","$117",..: 37 1 3 4 6 10 12 18 24 27 ...
$ Local.Benefits : Factor w/ 24 levels "$115","$136",..: 24 1 2 12 14 16 19 22 23 21 ...
$ Federal.Benefits: Factor w/ 39 levels "$0","$100","$102",..: 1 1 1 12 12 17 20 19 19 21 ...

Best answer

If you only need to remove the $ and do not want to change the class of the columns:

indx <- sapply(data, is.factor)
data[indx] <- lapply(data[indx], function(x) as.factor(gsub("\\$", "", x)))
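
The pattern is written as "\\$" because an unescaped $ is a regex anchor (end of string), not a literal character. As a minimal sketch on a standalone vector (the names x, esc, cls and lit are made up for illustration and are not part of the original answer), these three spellings should give the same result:

```r
# Hypothetical example vector, not taken from the question's data
x <- factor(c("-$3,333", "$0", "$2,155"))

# Three ways to match a literal dollar sign with gsub()
esc <- gsub("\\$", "", x)              # escape the regex metacharacter
cls <- gsub("[$]", "", x)              # put it inside a character class
lit <- gsub("$", "", x, fixed = TRUE)  # treat the pattern as a plain string

# gsub() coerces the factor to character, so all three are character vectors
print(esc)
print(identical(esc, cls) && identical(esc, lit))
```

Note that gsub() returns a character vector, which is why the answer wraps it in as.factor() to keep the original class.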

If you need numeric columns, you can also strip the commas (suggested by @David Arenburg) and convert with as.numeric:

data[indx] <- lapply(data[indx], function(x) as.numeric(gsub("[,$]", "", x)))
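
Two pitfalls are worth checking before relying on the numeric conversion. The sketch below uses a made-up vector x (an assumption for illustration): calling as.numeric() directly on a factor returns the internal level codes rather than the amounts, and leaving the thousands separator in place turns those values into NA.

```r
# Hypothetical example vector, not taken from the question's data
x <- factor(c("-$3,333", "$0", "$2,155"))

# Pitfall 1: this gives the factor's integer level codes, not the amounts
codes <- as.numeric(x)

# Pitfall 2: removing only the $ leaves "3,333", which as.numeric() cannot parse
# (suppressWarnings hides the "NAs introduced by coercion" warning)
no_dollar <- suppressWarnings(as.numeric(gsub("\\$", "", x)))

# Removing both $ and , yields the intended numbers; the minus sign is kept
cleaned <- as.numeric(gsub("[,$]", "", x))

print(no_dollar)
print(cleaned)
```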

You can wrap this in a function:

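# dat:   data frame to clean
# pat:   regex of characters to remove ("[$]" removes only the dollar sign)
# Class: "factor" keeps factor columns; any other value converts to numeric
f1 <- function(dat, pat = "[$]", Class = "factor") {
  indx <- sapply(dat, is.factor)  # only factor columns are touched
  if (Class == "factor") {
    dat[indx] <- lapply(dat[indx], function(x) as.factor(gsub(pat, "", x)))
  } else {
    dat[indx] <- lapply(dat[indx], function(x) as.numeric(gsub(pat, "", x)))
  }
  dat
}

f1(data)                                   # strip $ only, keep factors
f1(data, pat = "[,$]", Class = "numeric")  # strip $ and commas, return numeric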
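
Since R 4.0.0, read.csv defaults to stringsAsFactors = FALSE, so the same file may arrive as character columns rather than factors, and an is.factor check would then select nothing. A minimal sketch of a variant that handles both cases follows; the helper name strip_currency and the inline CSV text are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: convert factor or character columns holding
# currency strings to numeric by removing "$" and ","
strip_currency <- function(dat, pat = "[,$]") {
  is_text <- vapply(dat, function(x) is.factor(x) || is.character(x), logical(1))
  dat[is_text] <- lapply(dat[is_text], function(x) as.numeric(gsub(pat, "", x)))
  dat
}

# Made-up CSV content; fields are quoted so the thousands separator
# is not read as a column delimiter
csv_text <- 'Year,Prog.Cost,Total.Benefits
1,"-$3,333","$2,155"
2,"$0","$2,418"'

raw <- read.csv(text = csv_text)
clean <- strip_currency(raw)
str(clean)
```

This converts every text column, so a column containing genuine non-numeric text would become NA with a warning; in that case restrict the selection to the currency columns by name.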

Data

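set.seed(24)
# stringsAsFactors = TRUE reproduces the factor columns shown in the question
# (since R 4.0.0 the default is FALSE, which would give character columns)
data <- data.frame(Year = 1:6,
                   Prog.Cost = sample(c("-$3,333", "$0"), 6, replace = TRUE),
                   Total.Benefits = sample(c("$2,155", "$2,418", "$2,312"), 6, replace = TRUE),
                   stringsAsFactors = TRUE)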

This post is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/26728750/
