
r - How to reclassify/replace values based on priority when there are repeats

Reposted. Author: 行者123. Last updated: 2023-12-05 08:31:49

I have a df in which value gives the status of each drug:

g1 <- data.frame(
  drug  = c('a', 'a', 'a', 'd', 'd'),
  value = c('fda', 'trial', 'case', 'case', 'pre')
)

drug value
1 a fda
2 a trial
3 a case
4 d case
5 d pre

For each drug that appears more than once, I want to replace its value according to the following priority order:

fda > trial > case > pre 

For example, if drug d is both "case" and "pre", then every occurrence of d should be reclassified as "case". The final table should look like this:

  drug value
1 a fda
2 a fda
3 a fda
4 d case
5 d case

How can I do this without looping over every drug, working out the priority first, and then replacing the values?

Best answer

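Since this is an ordinal variable, you can make g1$value an ordered factor, which is the appropriate class for it. You can then use functions such as min and max on it just as you would with numbers. Because the levels are listed from highest to lowest priority, min returns the highest-priority status within each group: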

g1$value <- ordered(g1$value, levels = c("fda", "trial", "case", "pre"))
g1$value
#[1] fda trial case case pre
#Levels: fda < trial < case < pre
g1$value <- ave(g1$value, g1$drug, FUN=min)
g1
# drug value
#1 a fda
#2 a fda
#3 a fda
#4 d case
#5 d case
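
If you would rather keep value as a plain character column, the same idea can be sketched with match() instead of an ordered factor. This is an alternative, not the method from the answer above; the names priority, rank and best are illustrative and do not appear in the original post.

```r
# Priority order, highest first (taken from the question).
priority <- c("fda", "trial", "case", "pre")

g1 <- data.frame(
  drug  = c("a", "a", "a", "d", "d"),
  value = c("fda", "trial", "case", "case", "pre")
)

# Position of each value in the priority vector: 1 = highest priority.
rank <- match(g1$value, priority)

# Smallest (best) rank within each drug, repeated for every row of that drug.
best <- ave(rank, g1$drug, FUN = min)

# Map the winning rank back to its label.
g1$value <- priority[best]
g1
```

Note that match() returns NA for any value missing from priority, so a misspelled status would surface as NA rather than being silently ranked.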

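Or, in dplyr terms:

library(dplyr)

g1 %>%
  # levels run from highest to lowest priority
  mutate(value = ordered(value, levels = c("fda", "trial", "case", "pre"))) %>%
  group_by(drug) %>%
  # min() picks the highest-priority status within each drug
  mutate(value = min(value))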

Neither the row order in the dataset nor the range of values present in any drug group should affect this result:

g2 <- data.frame(
  drug  = c("a", "a", "a", "d", "d", "e", "e", "e"),
  value = c("fda", "trial", "case", "case", "pre", "pre", "fda", "case")
)

# drug value
#1 a fda
#2 a trial
#3 a case
#4 d case
#5 d pre
#6 e pre
#7 e fda
#8 e case

g2 %>%
  mutate(value = ordered(value, levels = c("fda", "trial", "case", "pre"))) %>%
  group_by(drug) %>%
  mutate(value = min(value))

## A tibble: 8 x 2
## Groups: drug [3]
# drug value
# <fct> <ord>
#1 a fda
#2 a fda
#3 a fda
#4 d case
#5 d case
#6 e fda
#7 e fda
#8 e fda
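
The claim that row order does not matter can be checked with a small base R sketch: shuffle the rows and compare the per-drug winner before and after. The names priority, shuffled, top_original and top_shuffled are illustrative and not part of the original answer, and the seed is arbitrary.

```r
priority <- c("fda", "trial", "case", "pre")

g2 <- data.frame(
  drug  = c("a", "a", "a", "d", "d", "e", "e", "e"),
  value = c("fda", "trial", "case", "case", "pre", "pre", "fda", "case")
)
g2$value <- ordered(g2$value, levels = priority)

# Shuffle the rows; any permutation should give the same per-drug result.
set.seed(1)
shuffled <- g2[sample(nrow(g2)), ]

# Best (lowest) level code per drug, before and after shuffling.
top_original <- tapply(as.integer(g2$value), g2$drug, min)
top_shuffled <- tapply(as.integer(shuffled$value), shuffled$drug, min)

# TRUE if every drug gets the same winner either way.
all(top_original == top_shuffled)

# Winning label per drug.
setNames(priority[top_original], names(top_original))
```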

Regarding r - How to reclassify/replace values based on priority when there are repeats, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55153251/
