
r - Remove duplicates across columns in an R data frame

Reposted. Author: 行者123. Updated: 2023-12-04 07:42:54

I have a data frame in which three columns contain duplicate values:

Name    Year   Job1       Job2        Job3
Bob     2011   director   director    chair
Bob     2012   director   chair
Wendy   2011   advisor    chair       advisor
Henry   2010   CEO        president   president
Within each row, I want to remove the duplicates among the "Job1", "Job2", and "Job3" columns:
Name    Year   Job1       Job2        Job3
Bob     2011   director   NA          chair
Bob     2012   director   chair
Wendy   2011   advisor    chair       NA
Henry   2010   CEO        president   NA
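Basically, when a value is duplicated, keep it in the earlier column and remove it from the later one (for example, if "Job1" and "Job2" hold the same value, keep the one in "Job1").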

Best answer

We can loop over the rows of the "Job" columns and replace the duplicates with NA:

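# Indices of the columns named Job1, Job2, ...
nm1 <- grep('^Job\\d+$', names(df1))
# For each row, set later repeats to NA; t() restores the row-by-column layout
df1[nm1] <- t(apply(df1[nm1], 1, function(x) replace(x, duplicated(x), NA)))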
-output
df1
#    Name Year     Job1      Job2  Job3
#1    Bob 2011 director      <NA> chair
#2    Bob 2012 director     chair
#3  Wendy 2011  advisor     chair  <NA>
#4  Henry 2010      CEO president  <NA>
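
To see why this works, it may help to walk through a single row. The sketch below is only an illustration: the vector `x` and the matrix `m` are made-up stand-ins for one row of the "Job" columns, not part of the original answer. It shows that `duplicated()` flags only the later occurrences, that `replace()` blanks those positions, and that `apply()` over rows returns its results column-wise, which is why the answer wraps the call in `t()`.

```r
# One row's job values, in column order (hypothetical example row).
x <- c(Job1 = "director", Job2 = "director", Job3 = "chair")

# duplicated() is FALSE for the first occurrence and TRUE for later repeats,
# so the value in the earlier column is always the one that is kept.
dup <- duplicated(x)

# replace() sets the flagged positions to NA and leaves the rest untouched.
cleaned <- replace(x, dup, NA)
print(cleaned)

# apply(m, 1, f) stores each row's result as a COLUMN of the output,
# so a 2 x 3 input comes back as 3 x 2 and needs t() to restore its shape.
m <- rbind(x, x)
res <- apply(m, 1, function(r) replace(r, duplicated(r), NA))
print(dim(res))
print(dim(t(res)))
```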
data
df1 <- structure(list(Name = c("Bob", "Bob", "Wendy", "Henry"), Year = c(2011L, 
2012L, 2011L, 2010L), Job1 = c("director", "director", "advisor",
"CEO"), Job2 = c("director", "chair", "chair", "president"),
Job3 = c("chair", "", "advisor", "president")),
class = "data.frame", row.names = c(NA,
-4L))
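
One thing to watch for in data like this: blank cells are stored as empty strings, and `duplicated()` treats two empty strings in the same row as a repeat, so the second blank would become NA. If that is not wanted, a variant along these lines might help. This is a sketch under assumptions: the helper name `na_later_dups` and the toy data frame `df2` are hypothetical and do not come from the original question or answer.

```r
# Hypothetical helper: set later repeats in a row to NA,
# but never flag empty strings or existing NAs as duplicates.
na_later_dups <- function(x) {
  is_dup <- duplicated(x) & !is.na(x) & nzchar(x)
  replace(x, is_dup, NA)
}

# Toy data (not from the original question): row 1 has two blank cells,
# row 2 has a genuine repeat in Job1 and Job2.
df2 <- data.frame(
  Name = c("Ann", "Raj"),
  Job1 = c("chair", "CEO"),
  Job2 = c("", "CEO"),
  Job3 = c("", "advisor"),
  stringsAsFactors = FALSE
)

# Same row-wise pattern as the answer, with the helper swapped in.
job_cols <- grep("^Job\\d+$", names(df2))
df2[job_cols] <- t(apply(df2[job_cols], 1, na_later_dups))
print(df2)
```

With this helper the two blanks in the first row stay blank, while the repeated "CEO" in the second row is still replaced by NA in `Job2`.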

For more on r - Remove duplicates across columns in an R data frame, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/67360091/
