
r - Stacking data (possibly pivot_longer), but complicated, R

Reposted. Author: 行者123. Updated: 2023-12-05 02:32:44

I have data like this:

df<-structure(list(record_id = c(1, 2, 4), alcohol = c(1, 2, 1), 
ethnicity = c(1, 1, 1), bilateral_vs_unilateral = c(1, 2,
2), fat_grafting = c(1, 1, 0), number_of_adm_sheets_used = c(1,
NA, NA), number_of_adm_sheets_used_2 = c(1, 1, 1), number_of_fills = c(7,
NA, NA), number_of_fills_2 = c(7, NA, 2), total_fill_volume_ml_left = c(240,
NA, NA), total_volume_ml = c(240, 300, 550), implant_size_l = c(NA_real_,
NA_real_, NA_real_), implant_size_l_2 = c(NA_real_, NA_real_,
NA_real_)), row.names = c(NA, -3L), class = c("tbl_df", "tbl",
"data.frame"))

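This is information about patients; each row represents one patient who has had breast surgery.

I want to change it so that each row represents one specific breast (of the two). Several variables, from "number_of_adm_sheets_used" through "implant_size_l_2", have one column per side. I want to change them so that a single column covers either side. For example, "number_of_adm_sheets_used" refers to the left side and "number_of_adm_sheets_used_2" to the right side. I want to combine them into one column of sheets used, for both sides.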

My expected output looks like this:

[image not available]

After: [image not available]

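I think it is some variation of pivot_longer, but I'm running into trouble in a few places:

  • The real data has 68 columns.
  • I only need a duplicated row if the "bilateral_vs_unilateral" column is "1" (meaning bilateral).
  • The way I have used pivot_longer before, you specify "cols" and select one big range; I'm not sure how to stack pairs of columns, if that makes sense.
  • Fortunately, although there are 68 other columns, all of the "problem" columns are listed here. The pairs are:
      - "number_of_adm_sheets_used" and "number_of_adm_sheets_used_2"
      - "number_of_fills" and "number_of_fills_2"
      - "total_fill_volume_ml_left" and "total_volume_ml"
      - "implant_size_l" and "implant_size_l_2"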

Thanks

Best answer

If I have understood the question correctly, here is one possibility.

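library(dplyr)
library(tidyr)
library(stringr)

# Make long format: one row per record_id / variable
df.long <- df %>%
  pivot_longer(cols = -record_id) %>%
  # columns ending in "_2" belong to the second breast;
  # anchor the pattern so only a trailing "_2" is stripped
  mutate(subject = ifelse(str_sub(name, -2, -1) == "_2", "breast 2", NA),
         name = str_remove(name, "_2$")) %>%
  group_by(record_id, name) %>%
  mutate(subject = case_when(
    subject == "breast 2" ~ subject,
    n() == 2 ~ "breast 1",  # unsuffixed half of a pair
    n() == 1 ~ "patient"    # unpaired column, i.e. patient-level
  )) %>%
  ungroup()
# Note: total_fill_volume_ml_left and total_volume_ml do not share a base
# name, so they end up as patient-level columns unless renamed beforehand.

# statistics regarding the patient
patient <- df.long %>%
  filter(subject == "patient") %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-subject)

# statistics regarding each breast
breasts <- df.long %>%
  filter(str_detect(subject, "breast")) %>%
  pivot_wider(names_from = name, values_from = value)

# merge the two data.frames
patient %>%
  inner_join(breasts, by = "record_id") %>%
  select(record_id, subject, everything())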
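
An alternative that is not part of the original answer: the pairing can probably also be done in a single pivot_longer call using the ".value" sentinel in names_to, once every paired column carries an explicit "_1" or "_2" suffix. The sketch below uses a cut-down copy of the sample data and rests on several assumptions that should be checked against the real data: that the unsuffixed column is the left side and the "_2" column the right (as the question states for the ADM sheets), that total_fill_volume_ml_left is the left-side partner of total_volume_ml, and that for unilateral patients a side whose per-breast values are all NA is the side that was not operated on. The names `paired`, `breast_cols`, `long`, `by_breast` and `side` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Cut-down copy of the question's sample data (only some of the columns)
df <- tibble(
  record_id = c(1, 2, 4),
  bilateral_vs_unilateral = c(1, 2, 2),
  number_of_adm_sheets_used = c(1, NA, NA),
  number_of_adm_sheets_used_2 = c(1, 1, 1),
  number_of_fills = c(7, NA, NA),
  number_of_fills_2 = c(7, NA, 2),
  total_fill_volume_ml_left = c(240, NA, NA),
  total_volume_ml = c(240, 300, 550)
)

# Left-side columns that have a "_2" partner but no suffix of their own
paired <- c("number_of_adm_sheets_used", "number_of_fills")

long <- df %>%
  # Assumption: these two columns form a left/right pair
  rename(total_volume_ml_1 = total_fill_volume_ml_left,
         total_volume_ml_2 = total_volume_ml) %>%
  # Give the unsuffixed half of each pair an explicit "_1" suffix
  rename_with(~ paste0(.x, "_1"), .cols = all_of(paired)) %>%
  # ".value" keeps the base name as the output column;
  # the trailing digit becomes the `side` column
  pivot_longer(
    cols = matches("_[12]$"),
    names_to = c(".value", "side"),
    names_pattern = "^(.*)_([12])$"
  ) %>%
  mutate(side = if_else(side == "1", "left", "right"))

# Per-breast columns produced by the pivot
breast_cols <- c("number_of_adm_sheets_used", "number_of_fills",
                 "total_volume_ml")

# Keep both rows for bilateral patients (coded 1); for unilateral patients
# keep only a side with at least one non-missing value (a heuristic, see above)
by_breast <- long %>%
  filter(bilateral_vs_unilateral == 1 |
           if_any(all_of(breast_cols), ~ !is.na(.x)))
```

With 68 columns, only the `paired` vector and the rename step would need extending; every other column is carried along untouched as a patient-level column. If the real data records which side was operated on, filtering on that column would be safer than the all-NA heuristic.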

Regarding "r - Stacking data (possibly pivot_longer), but complicated, R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71163410/
