
Recoding factor levels

Reposted. Author: 行者123. Updated: 2023-12-04 11:44:24

I have the following data frame:

forStack
AGE BMI time A B ID
1 59 23.8 0 (0,75] (4,14.9] 9000099
2 69 29.8 0 (96.4,100] (-Inf,0] 9000296
3 71 22.7 0 (75,89.3] (4,14.9] 9000622
4 56 32.4 0 (0,75] (14.9,68] 9000798
5 72 30.7 0 (0,75] (14.9,68] 9001104
6 75 23.5 0 (96.4,100] (0,4] 9001400

dput(forStack)
structure(list(AGE = c(59, 69, 71, 56, 72, 75), BMI = c(23.8,
29.8, 22.7, 32.4, 30.7, 23.5), time = c(0, 0, 0, 0, 0, 0), A = structure(c(2L,
5L, 3L, 2L, 2L, 5L), .Label = c("(-Inf,0]", "(0,75]", "(75,89.3]",
"(89.3,96.4]", "(96.4,100]", "(100, Inf]"), class = "factor"),
B = structure(c(3L, 1L, 3L, 4L, 4L, 2L), .Label = c("(-Inf,0]",
"(0,4]", "(4,14.9]", "(14.9,68]", "(68, Inf]"), class = "factor"),
ID = c(9000099, 9000296, 9000622, 9000798, 9001104, 9001400
)), .Names = c("AGE", "BMI", "time", "A", "B", "ID"), row.names = c(NA,
6L), class = "data.frame")

The variables A and B are factors that represent quartiles:

   forStack$A
[1] (0,75] (96.4,100] (75,89.3] (0,75] (0,75] (96.4,100]
Levels: (-Inf,0] (0,75] (75,89.3] (89.3,96.4] (96.4,100] (100, Inf]

forStack$B
[1] (4,14.9] (-Inf,0] (4,14.9] (14.9,68] (14.9,68] (0,4]
Levels: (-Inf,0] (0,4] (4,14.9] (14.9,68] (68, Inf]

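I want to recode the values of A and B into two-level factors, as follows:

For A, the top factor levels (96.4,100] and (100, Inf] should be recoded as level 0, and all other levels as level 1.

For B, the lowest factor levels (-Inf,0] and (0,4] should be recoded as level 0, and all other levels as level 1.

So the data frame should look like this: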

 forStack
AGE BMI time A B ID
1 59 23.8 0 1 1 9000099
2 69 29.8 0 0 0 9000296
3 71 22.7 0 1 1 9000622
4 56 32.4 0 1 1 9000798
5 72 30.7 0 1 1 9001104
6 75 23.5 0 0 0 9001400

What is the most efficient way to do this? Many thanks in advance.

Best answer

Here is one approach:

within(forStack, {
  # 0 for the top two levels of A, 1 otherwise
  A <- as.numeric(!A %in% tail(levels(A), 2))
  # 0 for the bottom two levels of B, 1 otherwise
  B <- as.numeric(!B %in% head(levels(B), 2))
})
# AGE BMI time A B ID
# 1 59 23.8 0 1 1 9000099
# 2 69 29.8 0 0 0 9000296
# 3 71 22.7 0 1 1 9000622
# 4 56 32.4 0 1 1 9000798
# 5 72 30.7 0 1 1 9001104
# 6 75 23.5 0 0 0 9001400

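The basic idea here is that head and tail both take an "n" argument that lets you specify how many values you want from the "head" or "tail" of a vector or dataset. Applied to levels(A) and levels(B), this makes it easy to pick out (96.4,100] and (100, Inf] for vector A, and the corresponding lowest two levels for vector B.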
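To make the head/tail step concrete, here is a minimal sketch that walks through the expression for A one piece at a time. The factor is rebuilt from the dput output in the question; the intermediate names top_levels, in_top and A_recoded are introduced only for illustration and do not appear in the original answer.

```r
# Rebuild factor A from the question, keeping the levels in their original order.
A <- factor(
  c("(0,75]", "(96.4,100]", "(75,89.3]", "(0,75]", "(0,75]", "(96.4,100]"),
  levels = c("(-Inf,0]", "(0,75]", "(75,89.3]",
             "(89.3,96.4]", "(96.4,100]", "(100, Inf]")
)

# The last two levels, taken in level order (not in data order).
top_levels <- tail(levels(A), 2)
top_levels
# [1] "(96.4,100]" "(100, Inf]"

# TRUE where an observation falls in one of the top two levels.
in_top <- A %in% top_levels

# Negate and convert: top levels become 0, everything else becomes 1.
A_recoded <- as.numeric(!in_top)
A_recoded
# [1] 1 0 1 1 1 0
```

This relies on the levels already being sorted from lowest to highest interval, which is the case for the factors shown in the question.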

within is a convenient way to replace values in a data.frame on the fly.
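One detail worth spelling out: within() returns a modified copy rather than changing forStack in place, so the result has to be assigned to keep it. The sketch below also wraps the result in factor(), on the assumption that a two-level factor (as the question words it) is wanted rather than the numeric 0/1 column the answer produces. The three-row data frame and the name recoded are made up for illustration.

```r
# Small stand-in for the question's data (only the columns needed here).
forStack <- data.frame(
  ID = c(9000099, 9000296, 9000622),
  B  = factor(
    c("(4,14.9]", "(-Inf,0]", "(0,4]"),
    levels = c("(-Inf,0]", "(0,4]", "(4,14.9]", "(14.9,68]", "(68, Inf]")
  )
)

# within() evaluates the block against a copy and returns that copy;
# forStack itself is unchanged until the result is assigned.
recoded <- within(forStack, {
  B <- factor(as.numeric(!B %in% head(levels(B), 2)), levels = c(0, 1))
})

nlevels(forStack$B)      # the original column still has its five levels
levels(recoded$B)        # "0" "1"
as.character(recoded$B)  # "1" "0" "0"
```

To overwrite the original instead, assign the result back with forStack <- within(forStack, { ... }).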

For more on recoding factor levels, see this similar question on Stack Overflow: https://stackoverflow.com/questions/16271861/
