
r - How to gsub through a list of names in a loop

Reposted. Author: 行者123. Updated: 2023-12-04 08:35:56

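I have a batch of samples to submit to my university's cluster for processing, more than 1,000 in total. Rather than creating each script by hand, I'd like to know whether I can write a for loop that substitutes in the sample ID. The scripts are essentially identical; only the sample ID and the file location change.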

df <- structure(list(V1 = c("#!/bin/bash", "#BSUB -W 1440", "#BSUB -n 16", 
"#BSUB -x", "#BSUB -R \"rusage[mem=4000] span[hosts=1]\"", "#BSUB -o /gpfs_common/share01/files/abc123.out.%J.txt",
"#BSUB -e /gpfs_common/share01/files/abc123.err.%J.txt", "",
"", "", "", "mcli cp def456/abc123 /panfs/roc/groups/0/location/data.base",
"gzip /panfs/roc/groups/0/location/data.base/abc123", "mcli mv /panfs/roc/groups/0/location/data.base/abc123.gz def456/",
"", "", "#BSUB -J abc123", "\t\t\t", "", "", "", "", "")), row.names = c(NA,
-23L), class = c("data.table", "data.frame"))

names <- list(V1 = c("D00268.merged.dedup.realn.haplotypecaller.g.vcf",
"D00316.merged.dedup.realn.haplotypecaller.g.vcf", "D00426.merged.dedup.realn.haplotypecaller.g.vcf",
"D00432.merged.dedup.realn.haplotypecaller.g.vcf", "D00474.merged.dedup.realn.haplotypecaller.g.vcf",
"D00510.merged.dedup.realn.haplotypecaller.g.vcf", "D00574.merged.dedup.realn.haplotypecaller.g.vcf",
"D00607.merged.dedup.realn.haplotypecaller.g.vcf", "D00619.merged.dedup.realn.haplotypecaller.g.vcf",
"D00662.merged.dedup.realn.haplotypecaller.g.vcf"))

locations <- list(V1 = c("s3/lab/wgs/yrkt/D00268/gvcf/", "s3/lab/wgs/dach/D00316/gvcf/",
"s3/lab/wgs/mnpd/D00426/gvcf/", "s3/lab/wgs/yrkt/D00432/gvcf/",
"s3/lab/wgs/ckcs/D00474/gvcf/", "s3/lab/wgs/lbrt/D00510/gvcf/",
"s3/lab/wgs/shlt/D00574/gvcf/", "s3/lab/wgs/shlt/D00607/gvcf/",
"s3/lab/wgs/mnsc/D00619/gvcf/", "s3/lab/wgs/gtdn/D00662/gvcf/"
))
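So df is just a master script that I want to run the for loop over. In the master script I changed the sample name to "abc123" and the sample location to "def456", so that I can use something like gsub to find these two patterns and replace them with the sample ID and sample location. When it finishes, I'd like it to create a text file that looks like this: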
#!/bin/bash
#BSUB -W 1440
#BSUB -n 16
#BSUB -x
#BSUB -R "rusage[mem=4000] span[hosts=1]"
#BSUB -o /gpfs_common/share01/files/D00268.merged.dedup.realn.haplotypecaller.g.vcf.out.%J.txt
#BSUB -e /gpfs_common/share01/files/D00268.merged.dedup.realn.haplotypecaller.g.vcf.err.%J.txt




mcli cp s3/lab/wgs/yrkt/D00268/gvcf/D00268.merged.dedup.realn.haplotypecaller.g.vcf /panfs/roc/groups/0/location/data.base
gzip /panfs/roc/groups/0/location/data.base/D00268.merged.dedup.realn.haplotypecaller.g.vcf
mcli mv /panfs/roc/groups/0/location/data.base/D00268.merged.dedup.realn.haplotypecaller.g.vcf.gz s3/lab/wgs/yrkt/D00268/gvcf/


#BSUB -J D00268.merged.dedup.realn.haplotypecaller.g.vcf

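I think a for loop would be the simplest approach here, but I'm open to suggestions. I hope this all makes sense; let me know if you have any questions.
I've used this for loop before, but I've never used a for loop to gsub through a list: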
for(i in 1:nrow(df)){
df[i,'V1'] <- gsub("abc123", "D00268.merged.dedup.realn.haplotypecaller.g.vcf", df[i,'V1'])
df[i,'V1'] <- gsub("def456", "s3/lab/wgs/yrkt/D00268/gvcf/", df[i,'V1'])

}

Best answer

To stick with the for-loop idea and adapt the approach you proposed, you could do the following:

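# names and locations are parallel lists: element i of each describes sample i.
for (i in seq_along(locations[["V1"]])) {
  # Start from a fresh copy of the template lines on every iteration.
  # df[["V1"]] is a plain character vector for a data.frame or a data.table.
  script <- df[["V1"]]
  script <- gsub("abc123", names[["V1"]][i], script, fixed = TRUE)
  # Each location already ends in "/", so replace "def456/" (slash included)
  # to avoid a double slash in the "mcli cp" line.
  script <- gsub("def456/", locations[["V1"]][i], script, fixed = TRUE)
  # Write one script per sample: script_1.sh, script_2.sh, ...
  writeLines(script, paste0("script_", i, ".sh"))
}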
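
As a variation on the loop above, the substitution can be wrapped in a small function and the output files named after the sample ID instead of the loop index. This is only a sketch under a few assumptions: `fill_template`, `sample_names`, `sample_locations`, and `out_dir` are hypothetical names that do not appear in the question, the template is shortened to two lines, and the files are written to a temporary directory. It replaces `def456/` including the slash, since each location in the question already ends in `/`.

```r
# Hypothetical helper: fill one copy of the template for one sample.
fill_template <- function(template, sample_name, location) {
  # fixed = TRUE treats the placeholders as literal strings, not regexes.
  out <- gsub("def456/", location, template, fixed = TRUE)
  gsub("abc123", sample_name, out, fixed = TRUE)
}

# Shortened stand-ins for df$V1, names$V1 and locations$V1 from the question.
template <- c("#BSUB -J abc123",
              "mcli cp def456/abc123 /panfs/roc/groups/0/location/data.base")
sample_names <- c("D00268.merged.dedup.realn.haplotypecaller.g.vcf",
                  "D00316.merged.dedup.realn.haplotypecaller.g.vcf")
sample_locations <- c("s3/lab/wgs/yrkt/D00268/gvcf/",
                      "s3/lab/wgs/dach/D00316/gvcf/")

# Fail early if the two vectors are not parallel.
stopifnot(length(sample_names) == length(sample_locations))

# Build one filled-in script (a character vector of lines) per sample.
scripts <- Map(function(n, l) fill_template(template, n, l),
               sample_names, sample_locations)

# Name each file after the sample ID, i.e. the text before the first dot.
out_dir <- tempdir()  # assumption: replace with the real output directory
ids <- sub("\\..*$", "", sample_names)
paths <- file.path(out_dir, paste0(ids, ".sh"))

for (k in seq_along(scripts)) {
  writeLines(scripts[[k]], paths[k])
}
```

Naming files by sample ID makes it easier to match a script to its sample when there are more than 1,000 of them, and the length check guards against the two lists getting out of step.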

Regarding "r - How to gsub through a list of names in a loop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64795560/
