
r - Add missing values based on the values of a column in R

Reposted. Author: 行者123. Updated: 2023-12-02 09:14:57

Here is my sample dataset:

vector1 <-
  data.frame(
    "name" = "a",
    "age" = 10,
    "fruit" = c("orange", "cherry", "apple"),
    "count" = c(1, 1, 1),
    "tag" = c(1, 1, 2)
  )
vector2 <-
  data.frame(
    "name" = "b",
    "age" = 33,
    "fruit" = c("apple", "mango"),
    "count" = c(1, 1),
    "tag" = c(2, 2)
  )
vector3 <-
  data.frame(
    "name" = "c",
    "age" = 58,
    "fruit" = c("cherry", "apple"),
    "count" = c(1, 1),
    "tag" = c(1, 1)
  )

list <- list(vector1, vector2, vector3)
print(list)

Here is my attempt:

default <- c("cherry",
             "orange",
             "apple",
             "mango")

for (num in 1:length(list)) {
  # print(list[[num]])

  list[[num]] <- rbind(
    list[[num]],
    data.frame(
      "name" = list[[num]]$name,
      "age" = list[[num]]$age,
      "fruit" = setdiff(default, list[[num]]$fruit),  # add the missing fruits
      "count" = 0,
      "tag" = 1  # no solution found yet for the tag
    )
  )

  print(paste0("--------------", num, "--------"))
  print(list)
}
# print(list)

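I am trying to find out which fruits are missing from each data frame, where "missing" is judged separately for each value of tag. For example, the first data frame contains tags 1 and 2. If tag 1 lacks a default fruit (such as apple or banana), the missing default fruit should be added to the data frame with a count of 0. The expected format is: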

[[1]]
  name age  fruit count tag
1    a  10 orange     1   1
2    a  10 cherry     1   1
3    a  10  apple     1   2
4    a  10  mango     0   1
5    a  10  apple     0   1
6    a  10  mango     0   2
7    a  10 orange     0   2
8    a  10 cherry     0   2

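When I stepped through the loop, I also found that the first iteration adds mango three times, and I could not work out why it does not add each missing value just once. The overall output is: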

[[1]]
  name age  fruit count tag
1    a  10 orange     1   1
2    a  10 cherry     1   1
3    a  10  apple     1   2
4    a  10  mango     0   1
5    a  10  mango     0   1
6    a  10  mango     0   1

[[2]]
  name age  fruit count tag
1    b  33  apple     1   2
2    b  33  mango     1   2
3    b  33 cherry     0   1
4    b  33 orange     0   1

[[3]]
  name age  fruit count tag
1    c  58 cherry     1   1
2    c  58  apple     1   1
3    c  58 orange     0   1
4    c  58  mango     0   1
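
The repeated mango rows appear to come from vector recycling in data.frame(): name and age are passed as whole columns (three elements in the first data frame), so the single missing fruit is recycled to three rows. The following is a minimal sketch that reuses the question's column names; the variable names existing, recycled, and once are illustrative only.

```r
# First data frame from the question, reduced to the relevant columns
existing <- data.frame(
  name = "a",
  age = 10,
  fruit = c("orange", "cherry", "apple"),
  stringsAsFactors = FALSE
)
default <- c("cherry", "orange", "apple", "mango")

# Only one default fruit is absent here
missing_fruit <- setdiff(default, existing$fruit)

# existing$name and existing$age each have 3 elements, so data.frame()
# recycles the single missing fruit to match them
recycled <- data.frame(
  name = existing$name,
  age = existing$age,
  fruit = missing_fruit,
  count = 0,
  tag = 1
)
nrow(recycled)

# Passing only the first name and age yields one row per missing fruit
once <- data.frame(
  name = existing$name[1],
  age = existing$age[1],
  fruit = missing_fruit,
  count = 0,
  tag = 1
)
nrow(once)
```

Indexing with [1] removes the duplicates but still hard-codes tag = 1, so it does not by itself produce one row per fruit and tag combination.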

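Can anyone help with a simple approach, or suggest another one? Should I use the sqldf function to add the 0 values? Would that be an easy way to solve my problem?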

Best answer

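Consider a base R approach that uses lapply, expand.grid, transform, rbind, and aggregate: append every possible fruit and tag combination to each data frame with a count of 0, then keep the maximum count for each combination.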

new_list <- lapply(list, function(df) {
  fruit_tag_df <- transform(expand.grid(fruit=c("apple", "cherry", "mango", "orange"),
                                        tag=c(1,2)),
                            name = df$name[1],
                            age = df$age[1],
                            count = 0)

  aggregate(.~name + age + fruit + tag, rbind(df, fruit_tag_df), FUN=max)
})
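
As a point of comparison, the same completion can be sketched with a left join instead of rbind plus aggregate. This is an alternative technique, not the answer's method: it builds the full fruit and tag grid, merges the observed rows onto it, and fills unmatched counts with 0. The helper name complete_fruit_tag and its parameters are hypothetical, and the sketch assumes each fruit and tag pair occurs at most once per data frame.

```r
# Hypothetical helper: complete one data frame against a fruit x tag grid
complete_fruit_tag <- function(df, fruits, tags) {
  # Every fruit/tag combination, carrying the data frame's name and age
  grid <- expand.grid(fruit = fruits, tag = tags, stringsAsFactors = FALSE)
  grid$name <- df$name[1]
  grid$age <- df$age[1]

  # Left join: keep all grid rows, attach count where a row was observed
  out <- merge(grid, df, by = c("name", "age", "fruit", "tag"), all.x = TRUE)

  # Combinations with no observed row get a count of 0
  out$count[is.na(out$count)] <- 0

  out[order(out$tag, out$fruit), c("name", "age", "fruit", "count", "tag")]
}

# First data frame from the question
df <- data.frame(
  name = "a",
  age = 10,
  fruit = c("orange", "cherry", "apple"),
  count = c(1, 1, 1),
  tag = c(1, 1, 2),
  stringsAsFactors = FALSE
)

completed <- complete_fruit_tag(
  df,
  fruits = c("apple", "cherry", "mango", "orange"),
  tags = c(1, 2)
)
print(completed)
```

Unlike the aggregate version, a merge keeps duplicate observed rows rather than collapsing them to their maximum, so the two approaches only agree when each combination appears at most once.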

Output

new_list

# [[1]]
#   name age  fruit tag count
# 1    a  10  apple   1     0
# 2    a  10 cherry   1     1
# 3    a  10 orange   1     1
# 4    a  10  mango   1     0
# 5    a  10  apple   2     1
# 6    a  10 cherry   2     0
# 7    a  10 orange   2     0
# 8    a  10  mango   2     0

# [[2]]
#   name age  fruit tag count
# 1    b  33  apple   1     0
# 2    b  33  mango   1     0
# 3    b  33 cherry   1     0
# 4    b  33 orange   1     0
# 5    b  33  apple   2     1
# 6    b  33  mango   2     1
# 7    b  33 cherry   2     0
# 8    b  33 orange   2     0

# [[3]]
#   name age  fruit tag count
# 1    c  58  apple   1     1
# 2    c  58 cherry   1     1
# 3    c  58  mango   1     0
# 4    c  58 orange   1     0
# 5    c  58  apple   2     0
# 6    c  58 cherry   2     0
# 7    c  58  mango   2     0
# 8    c  58 orange   2     0
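
The output above fills both tags for every data frame, including tags that a data frame never contained. If only the tags actually present should be completed, one possible variation is to build the grid from unique(df$tag). This is a sketch under that assumption about the desired result, shown for the second data frame only; the names df_b, grid, and observed_only are illustrative.

```r
# Second data frame from the question: it contains tag 2 only
df_b <- data.frame(
  name = "b",
  age = 33,
  fruit = c("apple", "mango"),
  count = c(1, 1),
  tag = c(2, 2),
  stringsAsFactors = FALSE
)
default <- c("cherry", "orange", "apple", "mango")

# Grid limited to the tags observed in this data frame, with count 0
grid <- transform(
  expand.grid(fruit = default, tag = unique(df_b$tag), stringsAsFactors = FALSE),
  name = df_b$name[1],
  age = df_b$age[1],
  count = 0
)

# Stack observed rows and zero rows, then keep the larger count per combination
observed_only <- aggregate(
  count ~ name + age + fruit + tag,
  data = rbind(df_b, grid),
  FUN = max
)
print(observed_only)
```

With this variation the second data frame gets four rows (one per default fruit for tag 2) instead of eight.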

Regarding "r - Add missing values based on the values of a column in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48043372/
