
Recode multiple values in multiple columns with new values in R

Reposted. Author: bug小助手. Updated: 2023-10-24 23:01:45



I have a data frame with 18 columns. Columns 2 to 13 contain numeric values such as 0, 1, 2, 4, ... I want to recode them by range into three categories:


if columns 2:13 are 0 -> 0
if columns 2:13 are between 1 and 5 -> 1
else columns 2:13 -> 2

My attempt works, but it is not efficient:


df[,2:13][df[,2:13] == 1 | df[,2:13] == 2 | df[,2:13] == 3 | df[,2:13] == 4 | df[,2:13] == 5] <- 1

I appreciate your help.



Recommended answer


Try findInterval:



dplyr


library(dplyr)
df %>%
  mutate(
    across(2:13, ~ findInterval(., c(0, 1, 5), rightmost.closed = TRUE) - 1L)
  )
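To see why the breakpoints c(0, 1, 5) give the three categories, here is a minimal base R sketch on a plain vector. The sample values in x are made up for illustration and are not from the question's data.

```r
# Sample values covering each case in the question's rules (illustrative only)
x <- c(0, 1, 3, 5, 6, 12)

# Breakpoints 0, 1, 5 define the intervals [0, 1), [1, 5] and (5, Inf).
# rightmost.closed = TRUE is what puts the value 5 itself into the middle interval.
idx <- findInterval(x, c(0, 1, 5), rightmost.closed = TRUE)

# findInterval() numbers the intervals from 1, so subtract 1 to get codes 0, 1, 2
recoded <- idx - 1L

print(data.frame(x = x, recoded = recoded))
```

Note that a value below the first breakpoint would get interval 0 and therefore code -1, so this relies on the columns holding non-negative values.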

If this gets any more complex (such as non-consecutive recoded values), we might switch to case_when:



df %>%
  mutate(
    across(2:13, ~ case_when(
      . == 0 ~ 0L,
      between(., 1, 5) ~ 1L,
      TRUE ~ 2L
    ))
  )
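For non-consecutive codes, one possible alternative to case_when (not part of the answer above, just a sketch) is to use the findInterval result as an index into a lookup vector. The target codes 0, 10 and 99 are hypothetical, and this assumes all values are non-negative, since an interval index of 0 would silently drop elements.

```r
# Illustrative input values, not from the question's data
x <- c(0, 2, 5, 9)

# Hypothetical target codes for the three intervals, chosen only for illustration
codes <- c(0L, 10L, 99L)

# findInterval() returns 1, 2 or 3 here, which indexes directly into codes
recoded <- codes[findInterval(x, c(0, 1, 5), rightmost.closed = TRUE)]

print(recoded)
```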

base R


df[,2:13] <- lapply(df[,2:13], function(z) findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L)
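A small end-to-end check of the base R version, assuming a made-up 4-column data frame in place of the 18-column one (the column names and values are hypothetical):

```r
# Hypothetical data frame standing in for the 18-column one in the question
df <- data.frame(
  id   = c("a", "b", "c"),
  v1   = c(0, 3, 7),
  v2   = c(5, 0, 1),
  note = c("x", "y", "z")
)

cols <- 2:3  # would be 2:13 for the data in the question

# Recode only the selected columns; the other columns are left untouched
df[cols] <- lapply(df[cols], function(z) {
  findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L
})

print(df)
```

Assigning back into df[cols] keeps the result a data frame and leaves the other columns as they were.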

Comments

Awesome! Thank you so much. I was not aware of the findInterval function.

It's very similar to cut, which is useful when you want strings/labels back (including labels that look like number ranges).
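As a rough illustration of that comment, here is a sketch of cut returning labels instead of integer codes. The label text and sample values are made up, and the breaks assume integer, non-negative data, because cut closes intervals on the right by default.

```r
# Illustrative input values, not from the question's data
x <- c(0, 1, 3, 5, 6, 12)

# With the default right = TRUE the intervals are (-Inf, 0], (0, 5], (5, Inf]
lab <- cut(
  x,
  breaks = c(-Inf, 0, 5, Inf),
  labels = c("zero", "1-5", "over 5")  # hypothetical labels
)

print(data.frame(x = x, label = lab))
```

The result is a factor. Leaving out the labels argument makes cut generate range-style labels from the breaks itself.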
