
Iteratively replacing NAs using data.table in R

Reposted. Author: 行者123. Updated: 2023-12-04 00:57:47

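I am trying to replace NAs with a random sample drawn from the appropriate group. For example, in row 2 the NA belongs to "France" with Age "20-30" and Time "30-40". So I want to sample at random from the Response column of all other "France", "20-30", "30-40" observations.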

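The code below runs fine, but every NA in a group is replaced with the same random draw. For example, if I have several "France", "20-30", "30-40" NAs, their corresponding R2 values will all be identical.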

I want each NA to be sampled independently, but data.table seems to process the group "all at once", so I can't do that. Any ideas?

library(data.table)

# Key the table by the grouping columns, then fill NAs within each group.
DT <- data.table(mydf, key = "Country,Age,Time")
DT[, R2 := ifelse(is.na(Response), sample(na.omit(Response), 1),
                  Response), by = key(DT)]
DT
# Index Country Age Time Response R2
# 1: 5 France 20-30 30-40 1 1
# 2: 6 France 20-30 30-40 NA 2
# 3: 7 France 20-30 30-40 2 2
# 4: 1 Germany 20-30 15-20 1 1
# 5: 2 Germany 20-30 15-20 NA 1
# 6: 3 Germany 20-30 15-20 1 1
# 7: 4 Germany 20-30 15-20 0 0
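
The repeated value can probably be reproduced without data.table at all. The sketch below uses a made-up vector (`response`, `observed`, `same_draw` and `own_draw` are illustrative names, not part of the question) and assumes the cause is that `sample(..., 1)` is evaluated once and `ifelse()` recycles that single value into every NA position.

```r
set.seed(1)  # only so the sketch is reproducible

response <- c(1L, NA, 2L, NA, NA)          # made-up data for illustration
observed <- response[!is.na(response)]

# sample(..., 1) runs once, so ifelse() recycles the single draw
# into every NA position.
same_draw <- ifelse(is.na(response), sample(observed, 1), response)

# Drawing one value per NA gives each gap its own draw.
n_missing <- sum(is.na(response))
own_draw <- response
own_draw[is.na(response)] <-
  observed[sample.int(length(observed), n_missing, replace = TRUE)]
```

With `by`, the expression is evaluated once per group, so the same recycling would apply within each group.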

Here, mydf is:

mydf <- structure(list(Index = 1:7, Country = c("Germany", "Germany", 
"Germany", "Germany", "France", "France", "France"), Age = c("20-30",
"20-30", "20-30", "20-30", "20-30", "20-30", "20-30"), Time = c("15-20",
"15-20", "15-20", "15-20", "30-40", "30-40", "30-40"), Response = c(1L,
NA, 1L, 0L, 1L, NA, 2L)), .Names = c("Index", "Country", "Age",
"Time", "Response"), class = "data.frame", row.names = c(NA, -7L))

Best answer

I would do it like this:

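# Flag the missing responses.
DT[, is_na := is.na(Response)]
# Per group, draw one observed value (with replacement) for each NA.
nas <- DT[, sample(Response[!is_na], sum(is_na), TRUE),
          by = list(Country, Age, Time)]$V1
# Copy Response, then fill the NA rows; the draws line up because DT is sorted by its key.
DT[, R2 := Response][(is_na), R2 := nas]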
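
For reference, here is a self-contained sketch of the same idea that can be pasted into a fresh session. It is a variation, not the answer's exact code: `draw_observed` is a hypothetical helper introduced here, and it indexes through `sample.int()` because `sample(x, ...)` treats a single number `x >= 1` as the range `1:x`, which could matter for a group with only one observed response. The fill happens inside each group, so this version does not rely on the table being keyed.

```r
library(data.table)

# Same data as mydf in the question, written out by hand.
mydf <- data.frame(
  Index    = 1:7,
  Country  = c(rep("Germany", 4), rep("France", 3)),
  Age      = "20-30",
  Time     = c(rep("15-20", 4), rep("30-40", 3)),
  Response = c(1L, NA, 1L, 0L, 1L, NA, 2L),
  stringsAsFactors = FALSE
)
DT <- as.data.table(mydf)

# Hypothetical helper: draw n values from the non-NA entries of x,
# with replacement. Indexing via sample.int() avoids sample()'s
# special handling of a length-one numeric x.
draw_observed <- function(x, n) {
  x <- x[!is.na(x)]
  if (length(x) == 0L) return(rep(x[NA_integer_], n))  # nothing to draw from: stay NA
  x[sample.int(length(x), n, replace = TRUE)]
}

# Within each group, replace every NA with its own draw.
DT[, R2 := replace(Response, is.na(Response),
                   draw_observed(Response, sum(is.na(Response)))),
   by = .(Country, Age, Time)]
DT
```

Groups with no observed responses are left as NA here rather than raising an error; whether that is the right behaviour depends on the data.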

This question and answer come from a similar question on Stack Overflow about iteratively replacing NAs using data.table in R: https://stackoverflow.com/questions/21662915/
