r - Find overlapping regions and extract their respective values

Reposted. Author: 行者123. Updated: 2023-12-02 06:12:59
How can I find overlapping coordinates and extract the seg.mean value for each overlapping region?

data1
Rl  pValue    chr  start      end        CNA
2   2.594433  6    129740000  129780000  gain
2   3.941399  6    130080000  130380000  gain
1   1.992114  10   80900000   81100000   gain
1   7.175750  16   44780000   44920000   gain

data2

ID    chrom  loc.start  loc.end    num.mark  seg.mean
8410  6      129750000  129760000  8430      0.0039
8410  10     80907000   81000000   5         -1.7738
8410  16     44790000   44910000   12        0.0110

Desired output

Rl  pValue    chr  start      end        CNA   seg.mean
2   2.594433  6    129750000  129760000  gain  0.0039
1   1.992114  10   80907000   81000000   gain  -1.7738
1   7.175750  16   44790000   44910000   gain  0.0110

Best answer

As @Roland correctly suggested, here is a possible data.table::foverlaps solution:

library(data.table)
setDT(data1) ; setDT(data2) # Convert data sets to data.table objects
setnames(data2, c("loc.start", "loc.end"), c("start", "end")) # Rename columns so they will match in both sets
setkey(data2, start, end) # key data2 on its interval columns so foverlaps will work (note: chromosome is not part of this key)
foverlaps(data1, data2, nomatch = 0L)[, 1:5 := NULL][] # run foverlaps and remove the unnecessary columns
#    seg.mean Rl   pValue chr   i.start     i.end  CNA
# 1:   0.0039  2 2.594433   6 129740000 129780000 gain
# 2:  -1.7738  1 1.992114  10  80900000  81100000 gain
# 3:   0.0110  1 7.175750  16  44780000  44920000 gain

Or

indx <- foverlaps(data1, data2, nomatch = 0L, which = TRUE) # run foverlaps in order to find indexes using `which`
data1[indx$xid][, seg.mean := data2[indx$yid]$seg.mean][] # update matches
#    Rl   pValue chr     start       end  CNA seg.mean
# 1:  2 2.594433   6 129740000 129780000 gain   0.0039
# 2:  1 1.992114  10  80900000  81100000 gain  -1.7738
# 3:  1 7.175750  16  44780000  44920000 gain   0.0110
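
Both snippets above match on start and end only and return the coordinates from data1, whereas the desired output in the question shows the narrower overlapping region. The following is a minimal sketch of one possible variation, not part of the original answer. It assumes the sample data from the question and keeps the original column names of data2. The chromosome goes into the key so that intervals on different chromosomes cannot match, and the intervals are clipped to their intersection with pmax and pmin. The names idx, y and result are illustrative.

```r
library(data.table)

# Sample data copied from the question
data1 <- data.table(
  Rl     = c(2, 2, 1, 1),
  pValue = c(2.594433, 3.941399, 1.992114, 7.175750),
  chr    = c(6, 6, 10, 16),
  start  = c(129740000, 130080000, 80900000, 44780000),
  end    = c(129780000, 130380000, 81100000, 44920000),
  CNA    = "gain"
)
data2 <- data.table(
  ID        = 8410,
  chrom     = c(6, 10, 16),
  loc.start = c(129750000, 80907000, 44790000),
  loc.end   = c(129760000, 81000000, 44910000),
  num.mark  = c(8430, 5, 12),
  seg.mean  = c(0.0039, -1.7738, 0.0110)
)

# Key data2 on chromosome plus interval; the last two key columns are the interval
setkey(data2, chrom, loc.start, loc.end)

# Overlap join restricted to matching chromosomes; which = TRUE returns row indices only
idx <- foverlaps(data1, data2,
                 by.x = c("chr", "start", "end"),
                 by.y = c("chrom", "loc.start", "loc.end"),
                 nomatch = 0L, which = TRUE)

result <- data1[idx$xid] # matching rows of data1
y      <- data2[idx$yid] # their partners in data2, in the same order

# Clip each interval to the region shared by both data sets and attach seg.mean
result[, `:=`(
  start    = pmax(start, y$loc.start),
  end      = pmin(end, y$loc.end),
  seg.mean = y$seg.mean
)]
print(result)
```

With the sample data this should keep three rows (chromosomes 6, 10 and 16) whose start, end and seg.mean values agree with the desired output in the question. The second row of data1 has no overlapping segment in data2 and is dropped by nomatch = 0L.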

Regarding "r - Find overlapping regions and extract their respective values", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29648127/
