
r - "Marking" duplicates in R

Reposted. Author: 行者123. Updated: 2023-12-05 00:52:45

I am working with the R programming language. Suppose I have the following data:

Data_I_Have <- data.frame(
  "Person" = c("John", "John", "John", "Peter", "Peter", "Peter", "Tim", "Kevin", "Adam", "Adam", "Xavier"),
  "Number_of_Kids" = c("4", "1", "1", "5", "2", "3", "7", "0", "3", "3", "5")
)

   Person Number_of_Kids
1    John              4
2    John              1
3    John              1
4   Peter              5
5   Peter              2
6   Peter              3
7     Tim              7
8   Kevin              0
9    Adam              3
10   Adam              3
11 Xavier              5

Is it possible to "mark" each duplicated name so that the data looks like the following (e.g. John_1, John_2, and so on)?

Data_I_Want <- data.frame(
  "Person" = c("John_1", "John_2", "John_3", "Peter_1", "Peter_2", "Peter_3", "Tim", "Kevin", "Adam_1", "Adam_2", "Xavier"),
  "Number_of_Kids" = c("4", "1", "1", "5", "2", "3", "7", "0", "3", "3", "5")
)

    Person Number_of_Kids
1   John_1              4
2   John_2              1
3   John_3              1
4  Peter_1              5
5  Peter_2              2
6  Peter_3              3
7      Tim              7
8    Kevin              0
9   Adam_1              3
10  Adam_2              3
11  Xavier              5

Following a previous question, "Add specific characters to duplicated strings", I tried the approach used there:

Data_I_Want <-  make.unique(Data_I_Have, sep = '_')

But this gave me the following error:

Error in make.unique(Data_I_Have, sep = "_") : 
'names' must be a character vector

Can anyone tell me how to fix this?

Thanks!

Best answer

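make.unique expects a character vector, not a data.frame. By default it also appends 1, 2, 3, ... (with "." as sep) only from the second occurrence of a value onward, not from the first one. That is: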

> make.unique(Data_I_Have$Person)
[1] "John" "John.1" "John.2" "Peter" "Peter.1" "Peter.2" "Tim" "Kevin" "Adam" "Adam.1" "Xavier"
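As a minimal sketch of what the original attempt was probably aiming for, make.unique can be applied to the Person column alone, with sep = "_". The name `fixed` is only illustrative, and as.character() is added as a precaution in case the column was read in as a factor (the default for data.frame before R 4.0). Note that the first occurrence of each name still stays unmarked, so this does not yet produce John_1.

```r
Data_I_Have <- data.frame(
  Person = c("John", "John", "John", "Peter", "Peter", "Peter",
             "Tim", "Kevin", "Adam", "Adam", "Xavier"),
  Number_of_Kids = c("4", "1", "1", "5", "2", "3", "7", "0", "3", "3", "5")
)

# Work on a copy (hypothetical name) so the original data stays untouched
fixed <- Data_I_Have

# make.unique needs a character vector, so pass the column, not the data.frame
fixed$Person <- make.unique(as.character(fixed$Person), sep = "_")

print(fixed$Person)
#  [1] "John"    "John_1"  "John_2"  "Peter"   "Peter_1" "Peter_2" "Tim"
#  [8] "Kevin"   "Adam"    "Adam_1"  "Xavier"
```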

To get the desired output, group by Person, paste row_number() onto the grouping column for every group that has more than one row, and then ungroup().

library(dplyr)
library(stringr)
Data_I_Have %>%
  group_by(Person) %>%
  # add the within-group row number only when the name occurs more than once
  mutate(Person = case_when(
    n() > 1 ~ str_c(Person, "_", row_number()),
    TRUE ~ Person
  )) %>%
  ungroup()

-output

# A tibble: 11 x 2
   Person  Number_of_Kids
   <chr>   <chr>
 1 John_1  4
 2 John_2  1
 3 John_3  1
 4 Peter_1 5
 5 Peter_2 2
 6 Peter_3 3
 7 Tim     7
 8 Kevin   0
 9 Adam_1  3
10 Adam_2  3
11 Xavier  5
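If loading dplyr and stringr is not desirable, the same grouping logic can probably be expressed in base R. The sketch below is an alternative to the answer's pipeline, not part of it; it uses ave() to compute a running index and a group size per name. The names `person`, `idx`, `n` and `marked` are illustrative only.

```r
Data_I_Have <- data.frame(
  Person = c("John", "John", "John", "Peter", "Peter", "Peter",
             "Tim", "Kevin", "Adam", "Adam", "Xavier"),
  Number_of_Kids = c("4", "1", "1", "5", "2", "3", "7", "0", "3", "3", "5")
)

person <- as.character(Data_I_Have$Person)

# Running index of each row within its name group (1, 2, 3, ...)
idx <- ave(seq_along(person), person, FUN = seq_along)

# Number of rows sharing the same name, repeated for every row of the group
n <- ave(seq_along(person), person, FUN = length)

# Append "_<index>" only where the name occurs more than once
marked <- Data_I_Have
marked$Person <- ifelse(n > 1, paste(person, idx, sep = "_"), person)

print(marked)
```

Unlike the dplyr version, this returns a plain data.frame rather than a tibble.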

Regarding r - "Marking" duplicates in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69198793/
