
r - Concatenate the corresponding values for two sample IDs into a new single column

Reposted. Author: 行者123. Last updated: 2023-12-01 12:12:44

I have a data frame, sampleManifest, that looks like this:

SampleName           Status      Role     Sex
AU056001_00HI1299A   unaffected  sibling  female
AU056002_00HI1301A   unaffected  proband  male
AU0780201_00HI1775A  unaffected  father   male
AU0780202_00HI1777A  unaffected  mother   female
AU0780301_00HI1778A  affected    proband  male
.
.
.

I also have a separate data frame of pairwise sample comparisons, kinshipEstimates:

FID     ID1                  ID2                  Kinship   Relationship
AU0560  AU056001_00HI1299A   AU056002_00HI1301A   0.0283    full-sibling
AU0780  AU0780201_00HI1775A  AU0780202_00HI1777A  -0.00160  unrelated
AU0780  AU0780201_00HI1775A  AU0780301_00HI1778A  0.284     parent-child
AU0780  AU0780202_00HI1777A  AU0780301_00HI1778A  0.246     parent-child
.
.
.

I want to build a new data frame in which sampleManifest$Role is looked up for both samples in each row of kinshipEstimates, so that it looks like this:

FID     ID1                  ID2                  Roles            Kinship   Relationship
AU0560  AU056001_00HI1299A   AU056002_00HI1301A   sibling-proband  0.0283    full-sibling
AU0780  AU0780201_00HI1775A  AU0780202_00HI1777A  father-mother    -0.00160  unrelated
AU0780  AU0780201_00HI1775A  AU0780301_00HI1778A  father-proband   0.284     parent-child
AU0780  AU0780202_00HI1777A  AU0780301_00HI1778A  mother-proband   0.246     parent-child
.
.
.

I have been trying left_join, but I can't work out how to merge the Role of each sample in a pair into a single value.

Best answer

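One solution is a double left_join using the tidyverse packages. First, join kinshipEstimates to sampleManifest on ID1 = SampleName. Then join the result to sampleManifest again, this time on ID2 = SampleName. Because both joins bring in a Role column, the second join suffixes them as Role.x (the role of ID1) and Role.y (the role of ID2). Finally, use tidyr::unite to combine Role.x and Role.y into a single column.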

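library(tidyverse)

# Join the manifest once per ID column, then merge the two Role columns
left_join(kinshipEstimates, sampleManifest, by = c("ID1" = "SampleName")) %>%
  select(-Status, -Sex) %>%                                    # keep only Role for ID1
  left_join(sampleManifest, by = c("ID2" = "SampleName")) %>%  # Role clashes: Role.x, Role.y
  unite(Roles, Role.x, Role.y, sep = "-") %>%                  # e.g. "sibling-proband"
  select(-Sex, -Status)                                        # drop ID2's extra columns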


#   FID    ID1                 ID2                 Kinship Relationship Roles
# 1 AU0560 AU056001_00HI1299A  AU056002_00HI1301A   0.0283 full-sibling sibling-proband
# 2 AU0780 AU0780201_00HI1775A AU0780202_00HI1777A -0.0016 unrelated    father-mother
# 3 AU0780 AU0780201_00HI1775A AU0780301_00HI1778A  0.2840 parent-child father-proband
# 4 AU0780 AU0780202_00HI1777A AU0780301_00HI1778A  0.2460 parent-child mother-proband
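
The same lookup can also be written without joins. The following is a minimal base-R sketch, not part of the original answer: it assumes SampleName is unique in sampleManifest and uses match() to find the manifest row for each ID. The names role1, role2 and result are illustrative only.

```r
# Sample data copied from the data section of this answer
sampleManifest <- read.table(text =
"SampleName Status Role Sex
AU056001_00HI1299A unaffected sibling female
AU056002_00HI1301A unaffected proband male
AU0780201_00HI1775A unaffected father male
AU0780202_00HI1777A unaffected mother female
AU0780301_00HI1778A affected proband male",
stringsAsFactors = FALSE, header = TRUE)

kinshipEstimates <- read.table(text =
"FID ID1 ID2 Kinship Relationship
AU0560 AU056001_00HI1299A AU056002_00HI1301A 0.0283 full-sibling
AU0780 AU0780201_00HI1775A AU0780202_00HI1777A -0.00160 unrelated
AU0780 AU0780201_00HI1775A AU0780301_00HI1778A 0.284 parent-child
AU0780 AU0780202_00HI1777A AU0780301_00HI1778A 0.246 parent-child",
stringsAsFactors = FALSE, header = TRUE)

# match() gives, for each ID, the row index of that sample in the manifest
# (assumption: each SampleName appears only once)
role1 <- sampleManifest$Role[match(kinshipEstimates$ID1, sampleManifest$SampleName)]
role2 <- sampleManifest$Role[match(kinshipEstimates$ID2, sampleManifest$SampleName)]

# Paste the two roles together, ID1's role first
result <- kinshipEstimates
result$Roles <- paste(role1, role2, sep = "-")
print(result$Roles)
# [1] "sibling-proband" "father-mother"   "father-proband"  "mother-proband"
```

Unlike the join version, this appends Roles as the last column without creating intermediate Status and Sex columns that then have to be dropped.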

Data:

sampleManifest <- read.table(text = 
"SampleName Status Role Sex
AU056001_00HI1299A unaffected sibling female
AU056002_00HI1301A unaffected proband male
AU0780201_00HI1775A unaffected father male
AU0780202_00HI1777A unaffected mother female
AU0780301_00HI1778A affected proband male",
stringsAsFactors = FALSE, header = TRUE)

kinshipEstimates <- read.table(text =
"FID ID1 ID2 Kinship Relationship
AU0560 AU056001_00HI1299A AU056002_00HI1301A 0.0283 full-sibling
AU0780 AU0780201_00HI1775A AU0780202_00HI1777A -0.00160 unrelated
AU0780 AU0780201_00HI1775A AU0780301_00HI1778A 0.284 parent-child
AU0780 AU0780202_00HI1777A AU0780301_00HI1778A 0.246 parent-child",
stringsAsFactors = FALSE, header = TRUE)
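
One thing to check on real data: left_join keeps every row of kinshipEstimates, so an ID with no entry in sampleManifest gets an NA role, and unite pastes it as the text "NA" by default. The sketch below is an addition to the original answer and shows a quick base-R check with setdiff(). The ID "HYPOTHETICAL_ID" is made up purely to demonstrate a failed lookup, and the vector names are illustrative.

```r
# Sample names present in the manifest above
manifest_ids <- c("AU056001_00HI1299A", "AU056002_00HI1301A",
                  "AU0780201_00HI1775A", "AU0780202_00HI1777A",
                  "AU0780301_00HI1778A")

# IDs referenced by the pairwise table; "HYPOTHETICAL_ID" is invented
# for this example and does not occur in the real data
pair_ids <- c("AU056001_00HI1299A", "AU0780301_00HI1778A", "HYPOTHETICAL_ID")

# setdiff() lists the IDs that would end up with an NA Role after the joins
missing_ids <- setdiff(pair_ids, manifest_ids)
print(missing_ids)
# [1] "HYPOTHETICAL_ID"
```

On the actual data frames the equivalent call would be setdiff(c(kinshipEstimates$ID1, kinshipEstimates$ID2), sampleManifest$SampleName), which returns an empty character vector for the sample data shown here.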

Regarding "r - Concatenate the corresponding values for two sample IDs into a new single column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50571304/
