
r - R : how to initialize and add elements to an array in a loop

Reposted. Author: 行者123. Updated: 2023-12-03 08:00:49

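I have a matrix with a column of genes and, for every possible SNP of each gene, a column of the corresponding -log(P-values).

So the matrix has 3 columns: Gene_label, SNP, and minus_logpval. I am trying to write code that identifies, for each gene, the SNP with the highest -log(P-value). Here is head(data):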

        SNP           Gene_label minus_logpval
1 rs3934834 HES4/ENSG00000188290       14.1031
2 rs3766193 HES4/ENSG00000188290        7.0203
3 rs3766192 HES4/ENSG00000188290       10.7420
4 rs3766191 HES4/ENSG00000188290       10.4323
5 rs9442371 HES4/ENSG00000188290       10.2941
6 rs9442372 HES4/ENSG00000188290        8.4235

Here is the start of the code:
for(i in 1:254360) {
max_pval = 0
if(data$Gene_label[i]==data$Gene_label[i+1]) {
x = array(NA, dim=c(0,2));
x[i] = data$minus_logpval[i];
x[i+1] = data$minus_logpval[i+1];
temp = max(x);
if (temp>max_pval) {
max_pval=temp
line = i
}

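But for some reason R always gives me this error: Error in is.ordered(x) : argument "x" is missing, with no default. I am not even calling is.ordered(x)... I think the mistake is in how I initialize x (which is supposed to be an array), but I don't know how to fix it.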
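On the initialization itself: `array(NA, dim = c(0, 2))` creates a matrix with zero rows, so it holds no elements to fill. A minimal sketch of two common ways to build up a vector in a loop is shown here. The name `vals` is hypothetical and simply stands in for `data$minus_logpval`, using the six values from the head(data) output; this is an illustration of the idiom, not a diagnosis of the error message.

```r
# Hypothetical toy input standing in for data$minus_logpval
vals <- c(14.1031, 7.0203, 10.7420, 10.4323, 10.2941, 8.4235)

# Option 1: preallocate a numeric vector of the final length, then fill it
x <- rep(NA_real_, length(vals))
for (i in seq_along(vals)) {   # seq_along() is safe even when vals is empty
  x[i] <- vals[i]
}

# Option 2: start with an empty vector and append (simple, but slower for long loops)
y <- numeric(0)
for (v in vals) {
  y <- c(y, v)
}

max(x)        # largest value: 14.1031
which.max(x)  # position of the largest value: 1
```

Preallocating (option 1) is usually preferable when the final length is known in advance, as it is here.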

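Best answer

This is a perfect use case for ddply from the plyr package. Split the data.frame into subsets (by Gene_label) and operate on each piece (find the snp associated with the maximum minus_logpval):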

##  Reproducible example data
set.seed(1234)
df <- data.frame( Gene_label = rep( letters[1:3] , 3 ) , snp = rep( letters[5:7] , each = 3 ) , minus_logpval = rnorm(9) )
df
# Gene_label snp minus_logpval
#1 a e -1.2070657
#2 b e 0.2774292
#3 c e 1.0844412
#4 a f -2.3456977
#5 b f 0.4291247
#6 c f 0.5060559
#7 a g -0.5747400
#8 b g -0.5466319
#9 c g -0.5644520

## And a single line using 'ddply'
require(plyr)
ddply( df , .(Gene_label) , summarise , SNP = snp[which.max(minus_logpval)] )
# Gene_label SNP
#1 a g
#2 b f
#3 c e
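If installing plyr is not an option, the same split-apply-combine idea can be sketched in base R. This is an alternative written for illustration, not part of the original answer; the names `pieces` and `best` are hypothetical, and the example data is rebuilt so that the block runs on its own.

```r
## Same reproducible example data as above
set.seed(1234)
df <- data.frame(Gene_label = rep(letters[1:3], 3),
                 snp = rep(letters[5:7], each = 3),
                 minus_logpval = rnorm(9),
                 stringsAsFactors = FALSE)

## Split the rows by gene into a list of data.frames
pieces <- split(df, df$Gene_label)

## From each piece keep the row with the largest minus_logpval, then stack the rows
best <- do.call(rbind, lapply(pieces, function(d) d[which.max(d$minus_logpval), ]))
best
```

Unlike the ddply call, this version keeps the whole winning row, so the maximum minus_logpval is returned alongside the SNP.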

Regarding "r - R : how to initialize and add elements to an array in a loop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17165396/
