r - How to extract certain values from one column into a new column when the values in another column repeat?

Reposted. Author: 行者123. Updated: 2023-12-02 02:46:56

Apologies for the poorly worded title. I have some data like the following (grouped by id), where the "question" column contains repeated values:

df <- structure(list(study_id = c("02ipnnqgeovkrxz", "02ipnnqgeovkrxz", 
"02ipnnqgeovkrxz", "02ipnnqgeovkrxz", "02ipnnqgeovkrxz", "02ipnnqgeovkrxz",
"0bsilzm5iabdnoj", "0bsilzm5iabdnoj", "0bsilzm5iabdnoj", "0bsilzm5iabdnoj",
"0bsilzm5iabdnoj", "0bsilzm5iabdnoj", "1171bwmljjct6me", "1171bwmljjct6me",
"1171bwmljjct6me", "1171bwmljjct6me", "1171bwmljjct6me", "1171bwmljjct6me"
), question = c("37tlJa09k7zwKFL ", "37tlJa09k7zwKFL", "3WTpbAzIQmbnlpb ",
"3WTpbAzIQmbnlpb", "3eEVJgaAP6c9FPL ", "3eEVJgaAP6c9FPL", "7QhOyTdA1MjKmX3 ",
"7QhOyTdA1MjKmX3", "8eMvvNHEh1CAqk5 ", "8eMvvNHEh1CAqk5", "e3u9ZmoNISb0vfn ",
"e3u9ZmoNISb0vfn", "3IDmpN1FZDQqhcF ", "3IDmpN1FZDQqhcF", "3WRNXeyBSwuXvh3 ",
"3WRNXeyBSwuXvh3", "6QnjC0CHjV1kmvX ", "6QnjC0CHjV1kmvX"), response = c("0.839",
"word", "0.739", "word", "1.353", "picture", "1.418", "word",
"1.563", "word", "6.377", "word", "1.795", "picture", "1.876",
"picture", "0.96", "picture")), row.names = c(NA, -18L), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), groups = structure(list(study_id = c("02ipnnqgeovkrxz",
"02ipnnqgeovkrxz", "02ipnnqgeovkrxz", "02ipnnqgeovkrxz", "02ipnnqgeovkrxz",
"02ipnnqgeovkrxz", "0bsilzm5iabdnoj", "0bsilzm5iabdnoj", "0bsilzm5iabdnoj",
"0bsilzm5iabdnoj", "0bsilzm5iabdnoj", "0bsilzm5iabdnoj", "1171bwmljjct6me",
"1171bwmljjct6me", "1171bwmljjct6me", "1171bwmljjct6me", "1171bwmljjct6me",
"1171bwmljjct6me"), question = c("37tlJa09k7zwKFL", "37tlJa09k7zwKFL ",
"3eEVJgaAP6c9FPL", "3eEVJgaAP6c9FPL ", "3WTpbAzIQmbnlpb", "3WTpbAzIQmbnlpb ",
"7QhOyTdA1MjKmX3", "7QhOyTdA1MjKmX3 ", "8eMvvNHEh1CAqk5", "8eMvvNHEh1CAqk5 ",
"e3u9ZmoNISb0vfn", "e3u9ZmoNISb0vfn ", "3IDmpN1FZDQqhcF", "3IDmpN1FZDQqhcF ",
"3WRNXeyBSwuXvh3", "3WRNXeyBSwuXvh3 ", "6QnjC0CHjV1kmvX", "6QnjC0CHjV1kmvX "
), .rows = list(2L, 1L, 6L, 5L, 4L, 3L, 8L, 7L, 10L, 9L, 12L,
11L, 14L, 13L, 16L, 15L, 18L, 17L)), row.names = c(NA, -18L
), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))

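I am trying to reshape the data so that, within each grouped id, every row of the "question" column is unique, with the multiple responses to the same question split out into a separate column.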

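The "question" column represents the unique items a participant saw and should not repeat within an id (each subject saw each item only once). The "response" column represents their response to that item (picture/word), but their reaction times have also ended up in this column. I basically want to take the reaction times and put them in a new column (still matched to the appropriate id and question).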
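
As a small illustration of the problem described above, the sketch below uses a few made-up values shaped like the "response" and "question" columns. It assumes that reaction times are the only strings that convert to numbers, and the variable names as_num and kind are hypothetical. It also shows that the repeated question ids in the data differ only by a trailing space.

```r
# Illustrative values mirroring the mixed 'response' column
response <- c("0.839", "word", "1.353", "picture")

# as.numeric() returns NA (with a warning) for non-numeric strings,
# so the NA pattern separates reaction times from answers
as_num <- suppressWarnings(as.numeric(response))
kind <- ifelse(is.na(as_num), "response", "rt")
print(kind)

# The duplicated question ids differ by a trailing space,
# so they only count as the same item after trimming
question <- c("37tlJa09k7zwKFL ", "37tlJa09k7zwKFL")
print(length(unique(question)))          # 2 distinct strings
print(length(unique(trimws(question))))  # 1 distinct string
```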

A tidyverse solution would be great, but any guidance would be much appreciated! I tried several variations of spread/summarise but couldn't get them to work.

Best answer

Try this base R solution:

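# Data manipulation: strip stray whitespace so duplicated ids match exactly
df$study_id <- trimws(df$study_id)
df$question <- trimws(df$question)
df$response <- trimws(df$response)
# Reaction times convert to numbers; answers ("word"/"picture") become NA
df$Index <- suppressWarnings(as.numeric(df$response))
# Label each row by the kind of value it holds
df$Index2 <- ifelse(is.na(df$Index), 'response', 'rt')
df$Index <- NULL
# Drop the grouped tibble classes before using base reshape()
df <- as.data.frame(df)
# Reshape to wide: one row per study_id/question, one column per value kind
DataG <- reshape(df, idvar = c('study_id', 'question'), timevar = 'Index2', direction = "wide")
# Put the answer column before the reaction-time column
DataG <- DataG[, c(1, 2, 4, 3)]
rownames(DataG) <- NULL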

Output:

         study_id        question response.response response.rt
1 02ipnnqgeovkrxz 37tlJa09k7zwKFL              word       0.839
2 02ipnnqgeovkrxz 3WTpbAzIQmbnlpb              word       0.739
3 02ipnnqgeovkrxz 3eEVJgaAP6c9FPL           picture       1.353
4 0bsilzm5iabdnoj 7QhOyTdA1MjKmX3              word       1.418
5 0bsilzm5iabdnoj 8eMvvNHEh1CAqk5              word       1.563
6 0bsilzm5iabdnoj e3u9ZmoNISb0vfn              word       6.377
7 1171bwmljjct6me 3IDmpN1FZDQqhcF           picture       1.795
8 1171bwmljjct6me 3WRNXeyBSwuXvh3           picture       1.876
9 1171bwmljjct6me 6QnjC0CHjV1kmvX           picture        0.96
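
Since the question asked for a tidyverse approach, here is a minimal sketch of how the same reshaping might look with dplyr and tidyr. It is not part of the original answer. It runs on a small made-up sample in the shape of the question's data, assumes that any response which converts to a number is a reaction time, and uses hypothetical names (type, wide) for the helper column and the result.

```r
library(dplyr)
library(tidyr)

# Small sample in the same shape as the question's data
df <- tibble(
  study_id = rep("02ipnnqgeovkrxz", 4),
  question = c("37tlJa09k7zwKFL ", "37tlJa09k7zwKFL",
               "3eEVJgaAP6c9FPL ", "3eEVJgaAP6c9FPL"),
  response = c("0.839", "word", "1.353", "picture")
)

wide <- df %>%
  ungroup() %>%  # no-op here; on the grouped data it allows editing 'question'
  mutate(
    question = trimws(question),  # trailing spaces make duplicates look distinct
    # numeric-looking strings are reaction times, everything else is the answer
    type = if_else(is.na(suppressWarnings(as.numeric(response))), "response", "rt")
  ) %>%
  # one row per study_id/question, one column per kind of value
  pivot_wider(names_from = type, values_from = response) %>%
  mutate(rt = as.numeric(rt))

print(wide)
```

The ungroup() call matters on the original data because it is grouped by study_id and question, and dplyr does not allow a grouping column to be modified inside mutate().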

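Regarding "r - How to extract certain values from one column into a new column when the values in another column repeat?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62638601/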
