
r - How do I collapse the rows of a frequency table to add their counts in a new column?

Reposted. Author: 行者123. Updated: 2023-12-04 10:07:53

I have a data frame with sample classifications:

 Seq_ID   Family Father   Mother   Sex    Role    Type  
<chr> <dbl> <chr> <chr> <chr> <chr> <chr>
1 SSC02219 11000. 0 0 Male Father Parent
2 SSC02217 11000. 0 0 Female Mother Parent
3 SSC02254 11000. SSC02219 SSC02217 Male Proband Child
4 SSC02220 11000. SSC02219 SSC02217 Female Sibling Child
5 SSC02184 11001. 0 0 Male Father Parent
6 SSC02181 11001. 0 0 Female Mother Parent
7 SSC02178 11001. SSC02184 SSC02181 Male Proband Child
8 SSC03092 11002. 0 0 Male Father Parent
9 SSC03078 11002. 0 0 Female Mother Parent
10 SSC03070 11002. SSC03092 SSC03078 Female Proband Child

Currently, to get from a to b, I have to do this:
library(tidyverse)
library(janitor)

sample.df %>% tabyl(Role, Sex) %>%
  adorn_totals(where=c("row", "col") ) %>%
  as.tibble() %>% select(1,4,3,2) %>%
  # Part 2
  mutate(type=c("parent", "parent", "child", "child", " ")) %>%
  inner_join(., group_by(., type) %>%
               summarise(total=sum(Total))) %>%
  select(5,6,1,2,3,4)

This feels like a rather crude workaround. Is there a more direct way to do the second part in dplyr?

a: [image not preserved]

b: [image not preserved]

Best answer

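Here is one option. as.tibble is not necessary. mutate with case_when is easier to manage when you have many classes to assign to "parent" or "child". inner_join is not needed, because we can use group_by and mutate to compute total. Finally, I like to write out the column names when using the select function, because that will be easier for me to read later. You can of course use column indices instead, as long as you are confident that they will not change no matter what new analysis you add to the pipeline.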

library(tidyverse)
library(janitor)

sample.df %>%
  tabyl(Role, Sex) %>%
  adorn_totals(where = c("row", "col")) %>%
  select(Role, Total, Male, Female) %>%
  # Part 2
  mutate(type = case_when(
    Role %in% c("Mother", "Father")   ~ "parent",
    Role %in% c("Proband", "Sibling") ~ "child",
    TRUE                              ~ " "
  )) %>%
  group_by(type) %>%
  mutate(total = sum(Total)) %>%
  ungroup() %>%
  select(type, total, Role, Total, Male, Female)
# # A tibble: 5 x 6
# type total Role Total Male Female
# <chr> <dbl> <chr> <dbl> <dbl> <dbl>
# 1 parent 6. Father 3. 3. 0.
# 2 parent 6. Mother 3. 0. 3.
# 3 child 4. Proband 3. 2. 1.
# 4 child 4. Sibling 1. 0. 1.
# 5 " " 10. Total 10. 5. 5.
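
To see why the inner_join can be dropped, here is a minimal sketch that compares the two routes on a small hand-made table. The `counts` tibble and the names `via_join` and `via_mutate` are made up for illustration. The values only mirror the per-role totals shown above, and the Total row is left out to keep it short.

```r
library(dplyr)

# Hypothetical per-role totals, shaped like the Total column of the output above
counts <- tibble(
  Role  = c("Father", "Mother", "Proband", "Sibling"),
  Total = c(3, 3, 3, 1),
  type  = c("parent", "parent", "child", "child")
)

# Route 1 (as in the question): summarise per group, then join the sums back on
via_join <- counts %>%
  inner_join(
    counts %>% group_by(type) %>% summarise(total = sum(Total)),
    by = "type"
  )

# Route 2 (as in the answer): a grouped mutate keeps every row
# and repeats the group sum on each of them
via_mutate <- counts %>%
  group_by(type) %>%
  mutate(total = sum(Total)) %>%
  ungroup()

# Both routes should give the same table
all.equal(as.data.frame(via_join), as.data.frame(via_mutate))
```

The difference is that summarise collapses each group to one row, so the sums have to be joined back, while mutate inside group_by leaves the row count unchanged.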

Data
library(tidyverse)
library(janitor)

sample.df <- read.table(text = "Seq_ID Family Father Mother Sex Role Type
1 SSC02219 11000 0 0 Male Father Parent
2 SSC02217 11000 0 0 Female Mother Parent
3 SSC02254 11000 SSC02219 SSC02217 Male Proband Child
4 SSC02220 11000 SSC02219 SSC02217 Female Sibling Child
5 SSC02184 11001 0 0 Male Father Parent
6 SSC02181 11001 0 0 Female Mother Parent
7 SSC02178 11001 SSC02184 SSC02181 Male Proband Child
8 SSC03092 11002 0 0 Male Father Parent
9 SSC03078 11002 0 0 Female Mother Parent
10 SSC03070 11002 SSC03092 SSC03078 Female Proband Child ",
header = TRUE, stringsAsFactors = FALSE)

sample.df <- as_tibble(sample.df)
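
Since the sample data already carries a Type column (Parent / Child), the grouping could also be read from the data instead of being typed by hand. The following is only a sketch of that idea and not part of the original answer. It uses dplyr and tidyr rather than janitor, rebuilds just the three columns it needs in a hypothetical `roles` tibble, and does not produce the final Total row that adorn_totals adds.

```r
library(dplyr)
library(tidyr)

# Only the columns needed here, copied from the sample data above
roles <- tibble(
  Role = c("Father", "Mother", "Proband", "Sibling", "Father",
           "Mother", "Proband", "Father", "Mother", "Proband"),
  Sex  = c("Male", "Female", "Male", "Female", "Male",
           "Female", "Male", "Male", "Female", "Female"),
  Type = c("Parent", "Parent", "Child", "Child", "Parent",
           "Parent", "Child", "Parent", "Parent", "Child")
)

by_role <- roles %>%
  count(Type, Role, Sex) %>%                      # one row per Type/Role/Sex combination
  pivot_wider(names_from = Sex, values_from = n,
              values_fill = list(n = 0L)) %>%     # missing combinations become 0
  mutate(Total = Male + Female) %>%               # per-role total
  group_by(Type) %>%
  mutate(total = sum(Total)) %>%                  # per-type total, repeated on each row
  ungroup() %>%
  select(Type, total, Role, Total, Male, Female)

by_role
```

This assumes that Type is consistent for every Role in the real data. If it is not, the explicit case_when mapping from the answer is the safer choice.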

Regarding "r - How do I collapse the rows of a frequency table to add their counts in a new column?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50072102/
