
r - Mutate all columns matching a pattern, each time based on the previous column

Reposted. Author: 行者123. Last updated: 2023-12-03 16:56:36

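How can I mutate all columns whose names contain a pattern (with mutate_at from dplyr, I guess), each time using the preceding column?
For example, here every column with foo in its name should be mutated using the column before it (i.e. a for the column fooa, b for foob, and so on).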


set.seed(13)
dfrows = 5
df = data.frame(a = rnorm(dfrows),
                fooa = runif(dfrows),
                b = rnorm(dfrows, mean=50, sd=5),
                foob = runif(dfrows, min=0, max=5),
                c = rnorm(dfrows, mean=100, sd=10),
                fooc = runif(dfrows, min=0, max=10))
df
#            a      fooa        b      foob         c     fooc
# 1  0.5543269 0.6611216 48.26791 3.0999527  98.06053 6.035485
# 2 -0.2802719 0.8783709 51.15647 0.1586242 113.96432 2.299504
# 3  1.7751634 0.8905590 52.34582 2.3070636 101.00663 9.668332
# 4  0.1873201 0.5662805 50.58978 1.6501046  98.85561 6.045547
# 5  1.1425261 0.5935473 50.35224 3.1676038 107.02225 6.396047

library(dplyr)
df %>% mutate(fooa = fooa/100 * a,
              foob = foob/100 * b,
              fooc = fooc/100 * c)
#            a         fooa        b       foob         c     fooc
# 1  0.5543269  0.003664775 48.26791 1.49628246  98.06053 5.918428
# 2 -0.2802719 -0.002461827 51.15647 0.08114656 113.96432 2.620614
# 3  1.7751634  0.015808878 52.34582 1.20765132 101.00663 9.765657
# 4  0.1873201  0.001060757 50.58978 0.83478430  98.85561 5.976363
# 5  1.1425261  0.006781434 50.35224 1.59495949 107.02225 6.845194

# Equivalently, in base R:
for (i in c(2, 4, 6)) {
  df[,i] = df[,i]/100 * df[, i-1]
}
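
The loop above hard-codes positions 2, 4 and 6. As a rough sketch, the same calculation can be driven by column names instead, assuming every foo column has a partner whose name is the same with the foo prefix removed. The names df_named, original, foo_cols, foo_col and base_col are only illustrative and do not appear in the original post.

```r
set.seed(13)
dfrows <- 5
df_named <- data.frame(a = rnorm(dfrows),
                       fooa = runif(dfrows),
                       b = rnorm(dfrows, mean = 50, sd = 5),
                       foob = runif(dfrows, min = 0, max = 5),
                       c = rnorm(dfrows, mean = 100, sd = 10),
                       fooc = runif(dfrows, min = 0, max = 10))
original <- df_named  # untouched copy, kept only for comparison

# Names of all columns that start with "foo"
foo_cols <- grep("^foo", names(df_named), value = TRUE)

for (foo_col in foo_cols) {
  # Partner column: same name without the "foo" prefix (assumed to exist)
  base_col <- sub("^foo", "", foo_col)
  df_named[[foo_col]] <- df_named[[foo_col]] / 100 * df_named[[base_col]]
}
```

Because the pairing is done by name, the loop keeps working if other columns are added between a and fooa or if the column order changes.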

So I am looking for something like this, I guess:
# What should <PREVIOUS_COLUMN> be?
df %>% mutate_at(vars(contains('foo')), funs(./100 * <PREVIOUS_COLUMN>))

# OR, even better (more generic but in my case it will always be the previous column):
df %>% mutate_at(vars(contains('foo')), funs(./100 * <COLUMN_NAME_WITH_'foo'_PATTERN_REMOVED>))

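Edit: I should have mentioned that the original data.frame may contain more columns, possibly following patterns other than X then fooX, so an ideal solution should be able to target the right columns (but I will leave the question as it is, since all the answers provide good solutions and features).
A better example would be: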
set.seed(13)
dfrows = 5
df = data.frame(a = rnorm(dfrows),
                fooa = runif(dfrows),
                b = rnorm(dfrows, mean=50, sd=5),
                foob = runif(dfrows, min=0, max=5),
                bla = 5,
                c = rnorm(dfrows, mean=100, sd=10),
                fooc = runif(dfrows, min=0, max=10),
                blo = 8)
df
#            a      fooa        b      foob bla         c     fooc blo
# 1  0.5543269 0.6611216 48.26791 3.0999527   5  98.06053 6.035485   8
# 2 -0.2802719 0.8783709 51.15647 0.1586242   5 113.96432 2.299504   8
# 3  1.7751634 0.8905590 52.34582 2.3070636   5 101.00663 9.668332   8
# 4  0.1873201 0.5662805 50.58978 1.6501046   5  98.85561 6.045547   8
# 5  1.1425261 0.5935473 50.35224 3.1676038   5 107.02225 6.396047   8

Best answer

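Here is another approach using across() and cur_column(). Personally, I would not recommend doing calculations based on column positions; I would use column names instead, since that seems safer.
In the example below, we loop over the columns a, b and c with across() and access the values of each corresponding foo column using get() and cur_column().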

set.seed(13)
dfrows = 5
df = data.frame(a = rnorm(dfrows),
                fooa = runif(dfrows),
                b = rnorm(dfrows, mean=50, sd=5),
                foob = runif(dfrows, min=0, max=5),
                c = rnorm(dfrows, mean=100, sd=10),
                fooc = runif(dfrows, min=0, max=10))

library(dplyr)

df %>%
  mutate(across(matches("^[a-z]$"),
                ~ get(paste0("foo", cur_column())) / 100 * .x,
                .names = "foo{col}"))
#>            a         fooa        b       foob         c     fooc
#> 1  0.5543269  0.003664775 48.26791 1.49628246  98.06053 5.918428
#> 2 -0.2802719 -0.002461827 51.15647 0.08114656 113.96432 2.620614
#> 3  1.7751634  0.015808878 52.34582 1.20765132 101.00663 9.765657
#> 4  0.1873201  0.001060757 50.58978 0.83478430  98.85561 5.976363
#> 5  1.1425261  0.006781434 50.35224 1.59495949 107.02225 6.845194
Created on 2021-01-27 by the reprex package (v0.3.0)
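
The same name-based idea can also be turned around: select the foo columns and derive each partner name by stripping the prefix, which is closer to the generic variant asked for in the question. Below is a minimal sketch on the extended example with the extra bla and blo columns. The names df_ext and res, and the use of sub() to build the partner name, are illustrative choices rather than part of the original answer, and the sketch assumes each foo column has a partner column named without the prefix.

```r
library(dplyr)

set.seed(13)
dfrows <- 5
df_ext <- data.frame(a = rnorm(dfrows),
                     fooa = runif(dfrows),
                     b = rnorm(dfrows, mean = 50, sd = 5),
                     foob = runif(dfrows, min = 0, max = 5),
                     bla = 5,
                     c = rnorm(dfrows, mean = 100, sd = 10),
                     fooc = runif(dfrows, min = 0, max = 10),
                     blo = 8)

# Select only the columns whose names start with "foo"; for each one,
# look up its partner column by removing the "foo" prefix from the name.
res <- df_ext %>%
  mutate(across(starts_with("foo"),
                ~ .x / 100 * get(sub("^foo", "", cur_column()))))
```

Columns such as bla and blo are left untouched because they are not selected by starts_with("foo"), and the foo columns are overwritten in place since no .names argument is given.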

Regarding "r - Mutate all columns matching a pattern, each time based on the previous column", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/65915885/
