
r - Creating column totals of individual numbers from strings of numbers in R

Reposted. Author: 行者123. Updated: 2023-12-02 18:11:45

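I have a data frame covering 50 meditation techniques, each of which can be sorted into 4 different categories. Each person can assign a technique to more than one category. I want to count how many times each technique was assigned to each category (1, 2, 3 or 4), and create a new data frame that holds those totals.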

Example of the current data frame:

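| Techniques_1 | Techniques_2 | Techniques_3 | Techniques_4 | Techniques_5 | Techniques_6 |
|--------------|--------------|--------------|--------------|--------------|--------------|
| 1,3          | 2            | 4            | 2,3          | 1            | 1            |
| 2,3          | 3            | 4            | 2,3          | 1            | 1            |
| 2            | 2            | 3            | 3            | 1,2          | 1,2,3        |
| 1            | 3            | 3            | 1            | 2            | 2            |
| 2            | 2,3          | 2,3          | 1,2,3        | 1            | 1,2          |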

Desired new data frame:

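| Category_Count | Technique_1 | Technique_2 | Technique_3 | Technique_4 | Technique_5 | Technique_6 |
|----------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Category_1     | 2           | 0           | 0           | 2           | 4           | 4           |
| Category_2     | 3           | 3           | 1           | 3           | 2           | 3           |
| Category_3     | 2           | 3           | 3           | 4           | 0           | 1           |
| Category_4     | 0           | 0           | 2           | 0           | 0           | 0           |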
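To make the counting rule concrete, here is a minimal base R sketch for a single column. It is only an illustration of the desired result, not code from the question or the answer. The vector reproduces the first column of the example, and `picks` and `counts` are names chosen for this sketch.

```r
# One column from the example: one comma-separated string per respondent
techniques_1 <- c("1,3", "2,3", "2", "1", "2")

# Split every string on commas and flatten into a vector of single categories
picks <- unlist(strsplit(techniques_1, ",", fixed = TRUE))

# Fixing the factor levels to 1:4 keeps categories that nobody chose (count 0)
counts <- table(factor(picks, levels = 1:4))

# Counts for categories 1 to 4 are 2, 3, 2, 0
print(counts)
```

These four numbers are the first column of the desired data frame.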
Here is my data set. The counts differ from my example above because the example was made up. I had a go at using dput, so hopefully this is correct:

Med_Tech_Structure <- structure(list(Techniques_1 = c("2", "2", "2",
NA, "1", "1"),
Techniques_2 = c("2", "2", "2", NA, "1,2", "2"), Techniques_3 =
c("2",
"2", "2", NA, "1", "1"), Techniques_4 = c("2", "2", "2",
NA, "1", "2"), Techniques_5 = c("2,3", "4", "3", NA, "4",
"1"), Techniques_6 = c("3", "3", "3", NA, "4", "3"), Techniques_7 =
c("2",
"2", "2", NA, "1", "1"), Techniques_8 = c("1", "4", "1,3",
NA, "4", "2"), Techniques_9 = c("2", "2", "2", NA, "2", "2"
), Techniques_10 = c("1", "4", "1", NA, "2", "1"), Techniques_11 =
c("2",
"4", "1,2", NA, "4", "4"), Techniques_12 = c("1", "4", "2",
NA, "4", "4"), Techniques_13 = c("2,3", "4", "1,2", NA, "4",
"4"), Techniques_14 = c("2", "4", "1,2", NA, "2", "4"), Techniques_15
= c("2",
"4", "2,3", NA, "4", "4"), Techniques_16 = c("1", "1", "3",
NA, "1", "1"), Techniques_17 = c("3", "3", "3", NA, "4",
"4"), Techniques_18 = c("2", "4", "3", NA, "1", "1"), Techniques_19 =
c("2",
"2", "2", NA, "2", "2"), Techniques_20 = c("2", "4", "1",
NA, "4", "4"), Techniques_21 = c("1,2", "4", "1,2", NA, "4",
"4"), Techniques_22 = c("1,2", "1", "1,2", NA, "2", "4"),
Techniques_23 = c("2", "4", "2", NA, "2", "4"), Techniques_24 =
c("2",
"2", "2", NA, "2", "2"), Techniques_25 = c("2", "4", "2",
NA, "4", "4"), Techniques_26 = c("1,2", "4", "1,3", NA, "4",
"3"), Techniques_27 = c("2", "4", "3", NA, "1", "1,2"), Techniques_28
= c("2",
"4", "2", NA, "3", "4"), Techniques_29 = c("1,2", "4", "1,2",
NA, "4", "4"), Techniques_30 = c("1,2", "4", "2", NA, "4",
"4"), Techniques_31 = c("2", "4", "2,3", NA, "1,2", "1"),
Techniques_32 = c("2,3", "4", "1,3", NA, "4", "4"), Techniques_33 =
c("1,2",
"4", "1,2", NA, "4", "4"), Techniques_34 = c("1,2", "4",
"2,3", NA, "4", "4"), Techniques_35 = c("1,2", "4", "2,3",
NA, "4", "4"), Techniques_36 = c("2", "4", "2,3", NA, "4",
"4"), Techniques_37 = c("1,2", "4", "2,3", NA, "4", "4"),
Techniques_38 = c("1", "1", "2", NA, "1", "1"), Techniques_39 =
c("1,2",
"4", "2", NA, "1", "4"), Techniques_40 = c("1,2", "4", "2",
NA, "4", "4"), Techniques_41 = c("1,2", "4", "2", NA, "4",
"4"), Techniques_42 = c("1,2", "4", "1,2", NA, "4", "4"),
Techniques_43 = c("1,2,3", "4", "1,2", NA, "4", "4"), Techniques_44 =
c("1,2,3",
"3", "1,2,3", NA, "4", "4"), Techniques_45 = c("1,2", "1",
"2,3", NA, "4", "4"), Techniques_46 = c("1", "4", "1,2",
NA, "1,2", "4"), Techniques_47 = c("2", "4", "2", NA, "4",
"4"), Techniques_48 = c("1", "2", "2", NA, "1", "2"), Techniques_49 =
c("2",
"4", "2", NA, "1", "4"), Techniques_50 = c("1,2", "4", "2,3",
NA, "4", "4")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
# The small example from above, built as a data frame
CurrentFrame <- data.frame(
  Techniques_1 = c("1,3", "2,3", "2", "1", "2"),
  Techniques_2 = c("2", "3", "2", "3", "2,3"),
  Techniques_3 = c("4", "4", "3", "3", "2,3"),
  Techniques_4 = c("2,3", "2,3", "3", "1", "1,2,3"),
  Techniques_5 = c("1", "1", "1,2", "2", "1"),
  Techniques_6 = c("1", "1", "1,2,3", "2", "1,2"),
  stringsAsFactors = FALSE
)

dput(CurrentFrame)

Console output from dput:

structure(list(Techniques_1 = c("1,3", "2,3", "2", "1", "2"),
    Techniques_2 = c("2", "3", "2", "3", "2,3"),
    Techniques_3 = c("4", "4", "3", "3", "2,3"),
    Techniques_4 = c("2,3", "2,3", "3", "1", "1,2,3"),
    Techniques_5 = c("1", "1", "1,2", "2", "1"),
    Techniques_6 = c("1", "1", "1,2,3", "2", "1,2")),
    class = "data.frame", row.names = c(NA, -5L))

Best answer

A data.table approach:

library(data.table)
# convert to data.table (by reference, so CurrentFrame itself is modified)
setDT(CurrentFrame)
# melt to long format: one row per technique column and response string
DT <- melt(CurrentFrame, measure.vars = names(CurrentFrame))
# split comma-separated values so that each category gets its own row
DT <- DT[, .(value = unlist(strsplit(value, ","))), by = "variable"]
# cast to wide again, using length as the aggregate function
dcast(DT, paste0("Category_", value) ~ variable, value.var = "value", fun.aggregate = length)
#         value Techniques_1 Techniques_2 Techniques_3 Techniques_4 Techniques_5 Techniques_6
# 1: Category_1            2            0            0            2            4            4
# 2: Category_2            3            3            1            3            2            3
# 3: Category_3            2            3            3            4            0            1
# 4: Category_4            0            0            2            0            0            0
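The full data set in the question contains a row of NA values, and splitting an NA string yields NA again, so such rows may surface as their own category in the result. As a hedged alternative, here is a base R sketch that drops NA cells before counting. `count_categories` is a hypothetical helper written for this illustration, and `demo` is a small made-up data frame. Neither comes from the original answer.

```r
# Hypothetical helper (not part of the original answer): count how often
# each category appears in every column of comma-separated strings.
count_categories <- function(df, categories = 1:4) {
  counts <- sapply(df, function(col) {
    col <- as.character(col)
    # Drop NA cells, then split the remaining strings on commas
    picks <- unlist(strsplit(col[!is.na(col)], ",", fixed = TRUE))
    # Fixed factor levels keep categories that were never chosen (count 0)
    as.vector(table(factor(picks, levels = categories)))
  })
  rownames(counts) <- paste0("Category_", categories)
  counts
}

# Made-up demo data with one all-NA respondent
demo <- data.frame(
  Techniques_1 = c("1,3", NA, "2"),
  Techniques_2 = c("2", NA, "2,3"),
  stringsAsFactors = FALSE
)

res <- count_categories(demo)
print(res)
```

The result is an integer matrix with one row per category and one column per technique. Wrapping it in `as.data.frame(res)` would give a data frame if that shape is preferred.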

Regarding "r - Creating column totals of individual numbers from strings of numbers in R", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/72213971/
