
r - Copying information when there are multiple data points

Reposted. Author: 行者123. Last updated: 2023-12-05 02:26:32

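I have a data-cleaning problem. Data were collected at three time points, and some entries were recorded incorrectly. So when a student has more than one record, the value from the second data point needs to be copied to that student's other records.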

Here is my dataset:

df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5),
  text = c("female", "male", "male", "female", "female", "female",
           "male", "female", "male", "female", "female"),
  time = c("first", "second", "third", "first", "second", "third",
           "first", "second", "second", "third", "first")
)

> df
   id   text   time
1   1 female  first
2   1   male second
3   1   male  third
4   2 female  first
5   2 female second
6   2 female  third
7   3   male  first
8   3 female second
9   4   male second
10  4 female  third
11  5 female  first

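So ids 1, 3, and 4 have inconsistent gender information. When an id has multiple, differing entries for the gender variable (the text column), I need to copy the value from the second data point to all of that id's rows. If an id has only one data point, that row should stay in the dataset unchanged.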
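As a quick check of which ids actually carry conflicting entries, the following base R sketch counts the distinct text values per id. The names n_values and conflicting_ids are illustrative and not part of the original question.

```r
# Rebuild the example data from the question
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5),
  text = c("female", "male", "male", "female", "female", "female",
           "male", "female", "male", "female", "female"),
  time = c("first", "second", "third", "first", "second", "third",
           "first", "second", "second", "third", "first")
)

# Number of distinct gender entries recorded for each id
n_values <- tapply(df$text, df$id, function(x) length(unique(x)))

# ids with more than one distinct entry are the ones that need fixing
conflicting_ids <- as.numeric(names(n_values)[n_values > 1])
conflicting_ids
#> [1] 1 3 4
```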

The desired output is:

> df1
   id   text   time
1   1   male  first
2   1   male second
3   1   male  third
4   2 female  first
5   2 female second
6   2 female  third
7   3 female  first
8   3 female second
9   4   male second
10  4   male  third
11  5 female  first

Any ideas? Thanks!

Best answer

Just another fun approach:

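library(dplyr)

df %>%
  filter(time == "second") %>%   # keep only the "second" row of each id
  select(-time) %>%              # leaves id plus the reference text value
  # Join the reference values back onto every row of df; the original
  # column becomes text_old, the "second" value keeps the name text
  full_join(df, ., by = "id", suffix = c("_old", "")) %>%
  # ids without a "second" row get NA from the join, so fall back to text_old
  mutate(text = coalesce(text, text_old)) %>%
  select(names(df))              # restore the original columns and order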

#>    id   text   time
#> 1   1   male  first
#> 2   1   male second
#> 3   1   male  third
#> 4   2 female  first
#> 5   2 female second
#> 6   2 female  third
#> 7   3 female  first
#> 8   3 female second
#> 9   4   male second
#> 10  4   male  third
#> 11  5 female  first
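For comparison, the same rule can also be sketched without a join, by working within each id group. This is an alternative, not the answerer's method. It assumes each id has at most one "second" row (the `[1]` simply takes the first if there are more) and that ids with no "second" row keep their values as they are. The name fixed is illustrative.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5),
  text = c("female", "male", "male", "female", "female", "female",
           "male", "female", "male", "female", "female"),
  time = c("first", "second", "third", "first", "second", "third",
           "first", "second", "second", "third", "first")
)

fixed <- df %>%
  group_by(id) %>%
  # If the group has a "second" row, spread its text over the whole group;
  # otherwise leave the group's values unchanged (assumption, see above)
  mutate(text = if (any(time == "second")) text[time == "second"][1] else text) %>%
  ungroup()

fixed
```

The grouped version avoids the temporary text_old column, at the cost of spelling out the fallback explicitly instead of relying on coalesce().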

Regarding "r - Copying information when there are multiple data points", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73736230/
