
R: creating an adjacency matrix from columns in a data frame

Reposted. Author: 行者123. Updated: 2023-12-04 10:43:12

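I am interested in testing some network visualization techniques, but before trying those functions I want to build an adjacency matrix (from, to) from a data frame like the one below.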

Id  Gender  Col_Cold_1  Col_Cold_2  Col_Cold_3  Col_Hot_1  Col_Hot_2   Col_Hot_3
10  F       pain        sleep       NA          infection  medication  walking
14  F       Bump        NA          muscle      NA         twitching   flutter
17  M                   pain        hemaloma    Callus     infection
18  F       muscle                  pain                   twitching   medication

My goal is to create an adjacency matrix as follows:
1) All values in columns with keyword Cold will contribute to the rows  
2) All values in columns with keyword Hot will contribute to the columns

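For example, pain, sleep, Bump, muscle, and hemaloma are cell values under columns with the keyword Cold, so they form the rows. Cell values such as infection, medication, Callus, walking, twitching, and flutter are under columns with the keyword Hot, so they form the columns of the association matrix.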

The final desired output should look like this:
           infection  medication  walking  twitching  flutter  Callus
pain 2 2 1 1 1
sleep 1 1 1
Bump 1 1
muscle 1 1
hemaloma 1 1
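  • [pain, infection] = 2 because the association between pain and infection occurs twice in the original data frame: once in row 1 and once in row 3.
  • [pain, medication] = 2 because the association between pain and medication occurs twice: once in row 1 and once in row 4.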

Any advice or suggestions on generating such an association matrix would be greatly appreciated.
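
The counting rule described in the two bullets above can be checked by hand with a few lines of base R. This is only a sketch: the per-row values are copied from the data frame shown earlier, and `count_pair()` is a hypothetical helper introduced here for illustration. It counts a pair once per row in which both values appear.

```r
# Cold and hot values for each row of the example data (blanks and NAs left out)
cold <- list(
  c("pain", "sleep"),
  c("Bump", "muscle"),
  c("pain", "hemaloma"),
  c("muscle", "pain")
)
hot <- list(
  c("infection", "medication", "walking"),
  c("twitching", "flutter"),
  c("Callus", "infection"),
  c("twitching", "medication")
)

# Hypothetical helper: number of rows that contain both values
count_pair <- function(cold_value, hot_value) {
  sum(mapply(function(cv, hv) cold_value %in% cv && hot_value %in% hv, cold, hot))
}

count_pair("pain", "infection")   # rows 1 and 3
count_pair("pain", "medication")  # rows 1 and 4
```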

Reproducible dataset:
    df = structure(list(id = c(10, 14, 17, 18), Gender = structure(c(1L, 1L, 2L, 1L), .Label = c("F", "M"), class = "factor"), Col_Cold_1 = structure(c(4L, 2L, 1L, 3L), .Label = c("", "Bump", "muscle", "pain"), class = "factor"), Col_Cold_2 = structure(c(4L, 2L, 3L, 1L), .Label = c("", "NA", "pain", "sleep"), class = "factor"), Col_Cold_3 = structure(c(1L, 3L, 2L, 4L), .Label = c("NA", "hemaloma", "muscle", "pain" ), class = "factor"), Col_Hot_1 = structure(c(4L, 3L, 2L, 1L), .Label = c("", "Callus", "NA", "infection"), class = "factor"), Col_Hot_2 = structure(c(2L, 3L, 1L, 3L), .Label = c("infection", "medication", "twitching"), class = "factor"), Col_Hot_3 = structure(c(4L, 2L, 1L, 3L), .Label = c("", "flutter", "medication", "walking" ), class = "factor")), .Names = c("id", "Gender", "Col_Cold_1", "Col_Cold_2", "Col_Cold_3", "Col_Hot_1", "Col_Hot_2", "Col_Hot_3" ), row.names = c(NA, -4L), class = "data.frame")

Best answer

One approach is to reshape the dataset into "tidy" form and then use xtabs. First, some cleanup:

    df[] <- lapply(df, as.character)  # Convert factors to characters
    df[df == "NA" | df == "" | is.na(df)] <- NA # Make all blanks NAs
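
To see why the cleanup tests three conditions, here is a small sketch on a made-up data frame (`toy` and its values are assumptions for illustration, not part of the original data). The literal string "NA", the empty string, and a real missing value all end up as `NA`.

```r
# Made-up factor columns mixing "NA" strings, blanks, and real NAs
toy <- data.frame(
  a = factor(c("pain", "NA", "")),
  b = factor(c("", "x", NA))
)

toy[] <- lapply(toy, as.character)                  # factors -> characters
toy[toy == "NA" | toy == "" | is.na(toy)] <- NA     # unify all "missing" markers

toy
sum(is.na(toy))  # number of cells now treated as missing
```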

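Now, tidy the dataset:

    library(tidyr)
    library(dplyr)
    # For each "cold" column, pair it with every "hot" column, then stack the results
    out <- do.call(rbind, sapply(grep("^Col_Cold", names(df), value = TRUE), function(x) {
      vars <- c(x, grep("^Col_Hot", names(df), value = TRUE))
      # Gather the hot columns, then keep the cold column and the gathered values
      setNames(gather_(select(df, one_of(vars)),
                       key_col = x,
                       value_col = "value",
                       gather_cols = vars[-1])[, c(1, 3)], c("cold", "hot"))
    }, simplify = FALSE))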
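
In current tidyr and dplyr releases, `gather_()` is deprecated and `one_of()` is superseded. The following is a sketch of the same pairing idea using `pivot_longer()`, not the answer's original code. It runs on a trimmed-down stand-in data frame (two rows, two columns per group, assumed already cleaned), and the rows come out in a different order than with `gather_()`, which does not affect the final counts.

```r
library(dplyr)
library(tidyr)

# Trimmed-down stand-in for the cleaned data (character columns, real NAs)
df <- data.frame(
  id = c(10, 14),
  Col_Cold_1 = c("pain", "Bump"),
  Col_Cold_2 = c("sleep", NA),
  Col_Hot_1 = c("infection", NA),
  Col_Hot_2 = c("medication", "twitching"),
  stringsAsFactors = FALSE
)

# Pivot the cold columns first, then the hot columns: every cold value in a
# row ends up paired with every hot value in the same row
long_pairs <- df %>%
  pivot_longer(starts_with("Col_Cold"), names_to = "cold_col", values_to = "cold") %>%
  pivot_longer(starts_with("Col_Hot"), names_to = "hot_col", values_to = "hot") %>%
  select(cold, hot)

# Drop incomplete pairs and cross-tabulate
xtabs(~ cold + hot, data = na.omit(long_pairs))
```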

The idea is to "pair" each "cold" column with each "hot" column to make one long dataset. out looks like this:
    out
    # cold hot
    # 1 pain infection
    # 2 Bump <NA>
    # 3 <NA> Callus
    # 4 muscle <NA>
    # 5 pain medication
    # ...

Finally, use xtabs to produce the desired output:
    xtabs(~ cold + hot, na.omit(out))
    #           hot
    # cold       Callus flutter infection medication twitching walking
    #   Bump          0       1         0          0         1       0
    #   hemaloma      1       0         1          0         0       0
    #   muscle        0       1         0          1         2       0
    #   pain          1       0         2          2         1       1
    #   sleep         0       0         1          1         0       1
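
Since the stated goal is network visualization, it may help to turn the cross-tabulation into a weighted edge list as well. The sketch below uses a small made-up `out` (an assumption for illustration, not the full result above) and relies only on base R: `as.data.frame()` on a table gives one row per cell with a `Freq` column.

```r
# Made-up long data in the same shape as `out`
out <- data.frame(
  cold = c("pain", "pain", "sleep", NA, "pain"),
  hot  = c("infection", "medication", "infection", "Callus", "infection"),
  stringsAsFactors = FALSE
)

counts <- xtabs(~ cold + hot, data = na.omit(out))

# One row per (cold, hot) cell; keep only pairs that actually occur
edges <- subset(as.data.frame(counts), Freq > 0)
edges
```

Each row of `edges` is a (from, to, weight) triple, a form that many graph packages accept as input.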

A similar question about creating an adjacency matrix from data frame columns in R can be found on Stack Overflow: https://stackoverflow.com/questions/41214012/
