
r - Split a data frame and sort the split values by a specific column

Reposted. Author: 行者123. Updated: 2023-12-04 09:38:47

I have the following data frame:

tdf <- structure(list(GO = c("Cytokine-cytokine receptor interaction", 
"Cytokine-cytokine receptor interaction|Endocytosis", "I-kappaB kinase/NF-kappaB signaling",
"NF-kappa B signaling pathway", "NF-kappaB import into nucleus",
"T cell chemotaxis"), PosCount = c(17, 18, 4, 5, 1, 2), shortgo = structure(c(7L,
7L, 18L, 18L, 18L, 21L), .Label = c("TNF", "adaptive", "alpha",
"apop", "beta", "chemokine", "cytokine", "death", "defense",
"gamma", "immune response", "infla", "interleukin-1 ", "interleukin-10 ",
"interleukin-12 ", "interleukin-18 ", "interleukin-6 ", "kappa",
"migration", "stress", "taxis", "wound"), class = "factor")), .Names = c("GO",
"PosCount", "shortgo"), class = "data.frame", row.names = c(NA,
6L))

It looks like this:

> tdf
                                                  GO PosCount  shortgo
1             Cytokine-cytokine receptor interaction       17 cytokine
2 Cytokine-cytokine receptor interaction|Endocytosis       18 cytokine
3                I-kappaB kinase/NF-kappaB signaling        4    kappa
4                       NF-kappa B signaling pathway        5    kappa
5                      NF-kappaB import into nucleus        1    kappa
6                                  T cell chemotaxis        2    taxis

What I want to do is split the data frame by shortgo and then sort the GO members of each group by PosCount in descending order, producing this (made by hand):

$cytokine
[1] Cytokine-cytokine receptor interaction|Endocytosis
[2] Cytokine-cytokine receptor interaction

$kappa
[1] NF-kappa B signaling pathway
[2] I-kappaB kinase/NF-kappaB signaling
[3] NF-kappaB import into nucleus

$taxis
[1] T cell chemotaxis

I'm stuck at this:

> split(tdf$GO,tdf$shortgo)
Error in split.default(tdf$GO, tdf$hsortgo) :
group length is 0 but data length > 0

How can I do this?
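
A note on the error above: the message refers to `tdf$hsortgo`, so the call that actually ran appears to have misspelled the column name. On a plain data frame, `$` with a name that does not exist returns NULL, and `split()` then fails because the grouping vector has length zero. A minimal sketch, using a small hypothetical stand-in `df` rather than the real data:

```r
# Small hypothetical stand-in for the question's data frame
df <- data.frame(
  GO = c("A", "B", "C"),
  shortgo = factor(c("cytokine", "cytokine", "kappa"))
)

# A misspelled column name silently yields NULL on a plain data.frame
is.null(df$hsortgo)

# split() then receives a zero-length grouping vector and raises an error;
# capture the message instead of stopping
msg <- tryCatch(split(df$GO, df$hsortgo),
                error = function(e) conditionMessage(e))
msg

# With the correctly spelled column the call succeeds
res <- split(df$GO, df$shortgo)
names(res)
```

So the `split()` call itself is fine once the column name is spelled correctly; what remains is the ordering within each group.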

Best answer

You can order your data frame first, before splitting:

library(dplyr)
# Sort by group, then by descending PosCount within each group
tdf <- tdf %>% arrange(shortgo, desc(PosCount))

Then split:

ldf <- split(tdf$GO, tdf$shortgo, drop=TRUE)
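
The `drop=TRUE` argument matters here because `shortgo` is a factor with many more levels than actually occur in the data. A minimal sketch of the difference, assuming a shortened version of the data with a hypothetical four-level factor:

```r
# Shortened version of the data; the factor has unused levels ("TNF", "kappa")
tdf <- data.frame(
  GO = c("Cytokine-cytokine receptor interaction",
         "Cytokine-cytokine receptor interaction|Endocytosis",
         "T cell chemotaxis"),
  PosCount = c(17, 18, 2),
  shortgo = factor(c("cytokine", "cytokine", "taxis"),
                   levels = c("TNF", "cytokine", "kappa", "taxis")),
  stringsAsFactors = FALSE
)

# Without drop: one list element per factor level, including empty ones
with_empty <- split(tdf$GO, tdf$shortgo)

# With drop = TRUE: only levels that actually occur are kept
without_empty <- split(tdf$GO, tdf$shortgo, drop = TRUE)

length(with_empty)
names(without_empty)
```

With the full 22-level factor from the question, omitting `drop=TRUE` would leave an empty element for every unused level.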

This gives the desired (ordered) output:

> ldf
$cytokine
[1] "Cytokine-cytokine receptor interaction|Endocytosis"
[2] "Cytokine-cytokine receptor interaction"

$kappa
[1] "NF-kappa B signaling pathway"
[2] "I-kappaB kinase/NF-kappaB signaling"
[3] "NF-kappaB import into nucleus"

$taxis
[1] "T cell chemotaxis"

If you want to split your data frame into a list of data frames instead, you can use:

ldf <- split(tdf, tdf$shortgo, drop=TRUE)
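
Splitting into a list of data frames also allows the reverse approach: split first, then sort each piece. A minimal base R sketch, assuming the same six rows as in the question (the names `pieces`, `sorted_pieces` and `go_by_group` are only illustrative):

```r
# The six rows from the question, with a factor that only has the used levels
tdf <- data.frame(
  GO = c("Cytokine-cytokine receptor interaction",
         "Cytokine-cytokine receptor interaction|Endocytosis",
         "I-kappaB kinase/NF-kappaB signaling",
         "NF-kappa B signaling pathway",
         "NF-kappaB import into nucleus",
         "T cell chemotaxis"),
  PosCount = c(17, 18, 4, 5, 1, 2),
  shortgo = factor(c("cytokine", "cytokine", "kappa", "kappa", "kappa", "taxis")),
  stringsAsFactors = FALSE
)

# Split into one data frame per group
pieces <- split(tdf, tdf$shortgo, drop = TRUE)

# Sort each piece by descending PosCount
sorted_pieces <- lapply(pieces, function(d) d[order(-d$PosCount), ])

# Pull out just the GO column of each sorted piece
go_by_group <- lapply(sorted_pieces, function(d) d$GO)
go_by_group
```

This keeps the other columns available per group, at the cost of one extra `lapply()` pass.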

A base R solution (provided by @Henrik in the comments):

o <- order(tdf$shortgo, -tdf$PosCount)  # apply the same index to values and groups
split(tdf$GO[o], tdf$shortgo[o], drop=TRUE)
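
`split()` pairs values and groups position by position, so a reordered value vector needs a grouping factor reordered the same way. If only the values are reordered, the result is correct only when the rows already happen to be sorted by group. A minimal sketch with hypothetical shuffled rows (the labels such as `k_high` are made up for illustration):

```r
# Hypothetical rows that are NOT pre-sorted by group
df <- data.frame(
  GO = c("k_low", "c_high", "t_only", "k_high", "c_low"),
  PosCount = c(1, 18, 2, 5, 17),
  shortgo = factor(c("kappa", "cytokine", "taxis", "kappa", "cytokine")),
  stringsAsFactors = FALSE
)

# Order by group, then by descending PosCount
o <- order(df$shortgo, -df$PosCount)

# Same index applied to both vectors: groups and values stay aligned
safe <- split(df$GO[o], df$shortgo[o], drop = TRUE)

# Index applied to the values only: values land in the wrong groups
mismatched <- split(df$GO[o], df$shortgo, drop = TRUE)

safe
mismatched
```

In `safe`, each group holds its own members in descending `PosCount` order; in `mismatched`, members of one group end up under another group's name.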

Regarding "r - Split a data frame and sort the split values by a specific column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29913003/
