javascript - NetworkD3 Sankey diagram in R: How to calculate the value for each link?

Reposted. Author: 行者123. Updated: 2023-11-28 04:03:39

I am trying to follow the examples for the R port of d3Network to create a Sankey diagram, as detailed here: https://christophergandrud.github.io/networkD3/. I load the following sample "Energy" dataset:

    # Load energy projection data
    URL <- paste0("https://cdn.rawgit.com/christophergandrud/networkD3/",
                  "master/JSONdata/energy.json")
    Energy <- jsonlite::fromJSON(URL)

Importing the "Energy" dataset produces two new data.frames: nodes and links. Looking at the links data shows the following format:

    head(Energy$links)
      source target   value
    1      0      1 124.729
    2      1      2   0.597
    3      1      3  26.862
    4      1      4 280.322
    5      1      5  81.144
    6      6      2  35.000

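The "source" column indicates the node a link starts from, the "target" column indicates the node it ends at, and the "value" column holds the value of each individual link.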
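To make the format concrete, here is a minimal sketch of how the source and target columns relate to the nodes data.frame. networkD3 expects these columns to be zero-based row positions in the nodes table. The toy `nodes` and `links` values below are hypothetical and are not taken from the Energy dataset.

```r
# Hypothetical toy data, for illustration only (not from the Energy dataset)
nodes <- data.frame(name = c("A", "B", "C"), stringsAsFactors = FALSE)
links <- data.frame(source = c(0, 0, 1),
                    target = c(1, 2, 2),
                    value  = c(5, 3, 2))

# source/target are zero-based positions in `nodes`; R indexes from 1,
# so add 1 to look up the node names behind each link
labelled <- data.frame(from  = nodes$name[links$source + 1],
                       to    = nodes$name[links$target + 1],
                       value = links$value,
                       stringsAsFactors = FALSE)
print(labelled)
```

Read this way, the first row of the toy links table is a link from node "A" to node "B" with a value of 5.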

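Although this is conceptually quite simple, I am having a great deal of trouble getting my dataset into the same format as the Energy$links data.frame. I have been able to get my data into the following format, but I am drawing a blank on how to transform it further: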

    head(sampleSankeyData, n = 10L)
       clientID node1
          <int> <chr>
     1    23969 1 Community Services
     2    39199 1 Youth Justice
     3    23595 1 Mental Health
     4    15867 1 Community Services
     5    18295 3 Housing
     6    18295 2 Housing
     7    18295 1 Community Services
     8    18295 4 Housing
     9    15253 1 Housing
    10    27839 1 Community Services

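What I want to do is aggregate the number of unique clients per link. For example, in the subset of data above, because of client 18295, the link from "1 Community Services" to "2 Housing" should have a value of 1 (as should the links from "2 Housing" to "3 Housing" and from "3 Housing" to "4 Housing"). In other words, I would like to end up with data in the same format as Energy$links in the Sankey diagram example.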

Best answer

Try this:

    library(tidyverse)
    library(stringr)

    # Recreate the sample data from the question
    df <- tribble(
      ~number, ~clientID, ~node1,
      1 , 23969, '1 Community Services',
      2 , 39199, '1 Youth Justice',
      3 , 23595, '1 Mental Health',
      4 , 15867, '1 Community Services',
      5 , 18295, '3 Housing',
      6 , 18295, '2 Housing',
      7 , 18295, '1 Community Services',
      8 , 18295, '4 Housing',
      9 , 15253, '1 Housing',
      10, 27839, '1 Community Services')

    # Take the step number from the first character of node1 (so single-digit
    # steps only), spread the steps into one column each, then collapse to
    # one row per client
    df2 <- mutate(df, step = as.numeric(str_sub(node1, end = 1))) %>%
      spread(step, node1, sep = '_') %>%
      group_by(clientID) %>%
      summarise(step1 = sort(unique(step_1))[1],
                step2 = sort(unique(step_2))[1],
                step3 = sort(unique(step_3))[1],
                step4 = sort(unique(step_4))[1])

    # Stack the consecutive step pairs (1 to 2, 2 to 3, 3 to 4) as
    # source/target columns and count the clients on each link
    df3 <- bind_rows(select(df2, 1, source = 2, target = 3),
                     select(df2, 1, source = 3, target = 4),
                     select(df2, 1, source = 4, target = 5)) %>%
      group_by(source, target) %>%
      summarise(clients = n())

And to use it with networkD3...

    links <- df3 %>%
      dplyr::ungroup() %>% # ungroup just to be safe
      dplyr::filter(!is.na(source) & !is.na(target)) # remove lines without a link

    # build the nodes data frame based on nodes in your links data frame
    nodeFactors <- factor(sort(unique(c(links$source, links$target))))
    nodes <- data.frame(name = nodeFactors)

    # convert the source and target values to the index of the matching node in the
    # nodes data frame
    links$source <- match(links$source, levels(nodeFactors)) - 1
    links$target <- match(links$target, levels(nodeFactors)) - 1

    # plot
    library(networkD3)
    sankeyNetwork(Links = links, Nodes = nodes, Source = 'source',
                  Target = 'target', Value = 'clients', NodeID = 'name')
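The reshaping step above hard-codes four steps. As a possible variation (not part of the original answer), the same link counts could be sketched with `dplyr::lead()`, which pairs each row with the next step for the same client. This sketch assumes that each client has at most one node per step and that the step number is everything before the first space in `node1`; the name `links_alt` is made up for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  clientID = c(23969, 39199, 23595, 15867, 18295,
               18295, 18295, 18295, 15253, 27839),
  node1 = c("1 Community Services", "1 Youth Justice", "1 Mental Health",
            "1 Community Services", "3 Housing", "2 Housing",
            "1 Community Services", "4 Housing", "1 Housing",
            "1 Community Services"),
  stringsAsFactors = FALSE
)

links_alt <- df %>%
  # assumption: the step number is the text before the first space
  mutate(step = as.numeric(sub(" .*$", "", node1))) %>%
  arrange(clientID, step) %>%
  group_by(clientID) %>%
  # pair each node with the same client's next node
  mutate(target = lead(node1), next_step = lead(step)) %>%
  ungroup() %>%
  # keep only pairs of directly consecutive steps
  filter(!is.na(target), next_step == step + 1) %>%
  select(clientID, source = node1, target) %>%
  distinct() %>%                      # count each client once per link
  count(source, target, name = "clients")

print(links_alt)
```

For the sample data only client 18295 has more than one step, so this yields three links with one client each. The result has the same source, target, and clients columns as df3, so it should be usable in the networkD3 code above in place of df3.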

Regarding "javascript - NetworkD3 Sankey diagram in R: How to calculate the value for each link?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46880502/
