python - How to use wildcards in the arguments of snakemake's expand function?

Reposted. Author: 行者123. Updated: 2023-12-05 05:46:11

I have a JSON file like this:

{
  "foo": {
    "bar1": {
      "A1": {"name": "A1", "path": "/path/to/A1"},
      "B1": {"name": "B1", "path": "/path/to/B1"},
      "C1": {"name": "C1", "path": "/path/to/C1"},
      "D1": {"name": "D1", "path": "/path/to/D1"}
    },
    "bar2": {
      "A2": {"name": "A2", "path": "/path/to/A2"},
      "B2": {"name": "B2", "path": "/path/to/B2"},
      "C2": {"name": "C2", "path": "/path/to/C2"},
      "D2": {"name": "D2", "path": "/path/to/D2"}
    }
  }
}

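I am trying to run my snakemake pipeline separately on the samples in the sample sets "bar1" and "bar2", putting the results into their own folders. When I expand my wildcards, I don't want every combination of sample set and sample; I only want each sample within its own group, like this: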

tmp/bar1/A1.bam
tmp/bar1/B1.bam
tmp/bar1/C1.bam
tmp/bar1/D1.bam
tmp/bar2/A2.bam
tmp/bar2/B2.bam
tmp/bar2/C2.bam
tmp/bar2/D2.bam

Hopefully my snakefile helps explain. I tried setting up the snakefile like this:

sample_sets = [ i for i in config['foo'] ]

samples_dict = config['foo'] #cleans it up

def get_samples(wildcards):
    return list(samples_dict[wildcards.sample_set].keys())

rule all:
    input:
        expand(expand("tmp/{{sample_set}}/{sample}.bam", sample = get_samples), sample_set = sample_sets),

This doesn't work; my file names end with " "! I also tried:

rule all:
    input:
        expand(expand("tmp/{{sample_set}}/{sample}.bam", sample = list(samples_dict["{{sample_set}}"].keys())), sample_set = sample_sets),

But this gives a KeyError. I also tried this:

rule all:
    input:
        [ ["tmp/{{sample_set}}/{sample}.aligned_bam.core.bam".format( sample = sample ) for sample in list(samples_dict[sample_set].keys())] for sample_set in sample_sets ]

This one fails with a "Wildcards in input files cannot be determined from output files: 'sample_set'" error.
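
The doubled braces are the likely cause of that last error. A minimal plain-Python sketch, assuming only the standard behavior of `str.format` (doubled braces are escapes for literal braces), shows that `{{sample_set}}` survives formatting as an unresolved `{sample_set}` wildcard, whereas single braces let both values be filled in:

```python
# Doubled braces are escapes: format() leaves a literal {sample_set} behind,
# which Snakemake would then treat as a wildcard it cannot resolve in rule all.
pattern_escaped = "tmp/{{sample_set}}/{sample}.bam"
after_format = pattern_escaped.format(sample="A1")
print(after_format)  # tmp/{sample_set}/A1.bam

# With single braces, both placeholders are filled and no wildcard remains.
pattern_plain = "tmp/{sample_set}/{sample}.bam"
fully_resolved = pattern_plain.format(sample_set="bar1", sample="A1")
print(fully_resolved)  # tmp/bar1/A1.bam
```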

I feel like there must be a simple way to do this, and maybe I'm just being an idiot.

Any help would be greatly appreciated! Let me know if I've left out any details.

Best answer

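It is possible to use a custom combinatoric function in expand. Most often this function is zip; in your case, however, the nested dictionary shape would require designing a custom function. A simpler solution is to build the list of desired files with plain Python: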

d = {
    "foo": {
        "bar1": {
            "A1": {"name": "A1", "path": "/path/to/A1"},
            "B1": {"name": "B1", "path": "/path/to/B1"},
            "C1": {"name": "C1", "path": "/path/to/C1"},
            "D1": {"name": "D1", "path": "/path/to/D1"},
        },
        "bar2": {
            "A2": {"name": "A2", "path": "/path/to/A2"},
            "B2": {"name": "B2", "path": "/path/to/B2"},
            "C2": {"name": "C2", "path": "/path/to/C2"},
            "D2": {"name": "D2", "path": "/path/to/D2"},
        },
    }
}

list_files = []

# Outer keys are the sample sets; nested keys are the samples in each set.
for key in d["foo"]:
    for nested_key in d["foo"][key]:
        _tmp = f"tmp/{key}/{nested_key}.bam"
        list_files.append(_tmp)

print(*list_files, sep="\n")
#tmp/bar1/A1.bam
#tmp/bar1/B1.bam
#tmp/bar1/C1.bam
#tmp/bar1/D1.bam
#tmp/bar2/A2.bam
#tmp/bar2/B2.bam
#tmp/bar2/C2.bam
#tmp/bar2/D2.bam
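
The zip-based alternative mentioned at the start of this answer can be sketched as follows. This is a minimal sketch under stated assumptions, not a tested Snakefile: it assumes the nested dictionary is available as `config["foo"]`, and the helper name `paired_wildcards` is hypothetical. It flattens the dictionary into two parallel lists, which is the shape that `expand(..., zip, ...)` pairs up element by element; the last lines emulate that pairing with plain `str.format` so the snippet runs without Snakemake.

```python
def paired_wildcards(samples_dict):
    """Flatten {sample_set: {sample: ...}} into two parallel lists."""
    sample_sets, samples = [], []
    for sample_set, members in samples_dict.items():
        for sample in members:
            sample_sets.append(sample_set)
            samples.append(sample)
    return sample_sets, samples


# Stand-in for config["foo"] (assumption: the JSON is loaded as the Snakemake config).
samples_dict = {
    "bar1": {"A1": {}, "B1": {}},
    "bar2": {"A2": {}, "B2": {}},
}

sets_flat, samples_flat = paired_wildcards(samples_dict)

# In a Snakefile, the flattened lists would be passed to expand together with zip:
#   expand("tmp/{sample_set}/{sample}.bam", zip,
#          sample_set=sets_flat, sample=samples_flat)
# The plain-Python equivalent of that element-wise pairing:
targets = [
    "tmp/{sample_set}/{sample}.bam".format(sample_set=s, sample=x)
    for s, x in zip(sets_flat, samples_flat)
]
print(*targets, sep="\n")
```

Because each sample is paired only with its own set, no cross-set combinations such as tmp/bar1/A2.bam are generated, which is what a plain expand over both wildcards would produce.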

Regarding "python - How to use wildcards in the arguments of snakemake's expand function?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71221124/
