
python - Missing input files for rule all in snakemake

Reposted. Author: 行者123. Updated: 2023-12-01 00:19:05

I am trying to build a snakemake pipeline for biosynthetic gene cluster detection, but I am struggling with this error:

    Missing input files for rule all:
    antismash-output/Unmap_09/Unmap_09.txt
    antismash-output/Unmap_12/Unmap_12.txt
    antismash-output/Unmap_18/Unmap_18.txt

and so on for more files. As far as I can tell, the file generation in the snakefile should work:

    workdir: config["path_to_files"]

    wildcard_constraints:
        separator = config["separator"],
        extension = config["file_extension"],
        sample = config["samples"]

    rule all:
        input:
            expand("antismash-output/{sample}/{sample}.txt", sample = config["samples"])

    # merging the paired end reads (either fasta or fastq) as prodigal only takes single end reads
    rule pear:
        input:
            forward = "{sample}{separator}1.{extension}",
            reverse = "{sample}{separator}2.{extension}"

        output:
            "merged_reads/{sample}.{extension}"

        conda:
            "~/miniconda3/envs/antismash"

        shell:
            "pear -f {input.forward} -r {input.reverse} -o {output} -t 21"

    # If single end then move them to merged_reads directory
    rule move:
        input:
            "{sample}.{extension}"

        output:
            "merged_reads/{sample}.{extension}"

        shell:
            "cp {path}/{sample}.{extension} {path}/merged_reads/"

    # Setting the rule order on the 2 above rules which should be treated equally and only one run.
    ruleorder: pear > move

    # annotating the metagenome with prodigal. Can be done inside antiSMASH but prefer to do it out
    rule prodigal:
        input:
            "merged_reads/{sample}.{extension}"

        output:
            gbk_files = "annotated_reads/{sample}.gbk",
            protein_files = "protein_reads/{sample}.faa"

        conda:
            "~/miniconda3/envs/antismash"

        shell:
            "prodigal -i {input} -o {output.gbk_files} -a {output.protein_files} -p meta"

    # running antiSMASH on the annotated metagenome
    rule antiSMASH:
        input:
            "annotated_reads/{sample}.gbk"

        output:
            touch("antismash-output/{sample}/{sample}.txt")

        conda:
            "~/miniconda3/envs/antismash"

        shell:
            "antismash --knownclusterblast --subclusterblast --full-hmmer --smcog --outputfolder antismash-output/{wildcards.sample}/ {input}"

This is an example of my config.yaml file:

    file_extension: fastq
    path_to_files: /home/lamma/ABR/Each_reads
    samples:
      - Unmap_14
      - Unmap_55
      - Unmap_37
    separator: _

I can't see where things go wrong in the snakefile to produce such an error. Sorry for the simple question, I am new to snakemake.

Best answer

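The problem is that the global wildcard constraints are set up incorrectly. config["samples"] is a list, whereas a wildcard constraint has to be a single regular expression. Joining the sample names with | turns them into one regex alternation: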

    wildcard_constraints:
        separator = config["separator"],
        extension = config["file_extension"],
        sample = '|'.join(config["samples"]) # <-- this should fix the problem
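
To see what the join produces, here is a minimal sketch in plain Python, outside Snakemake. The sample list is copied from the example config.yaml; the helper is_allowed is hypothetical, and using re.fullmatch to stand in for Snakemake's wildcard matching is a simplifying assumption.

```python
import re

# Sample names copied from the example config.yaml in the question.
samples = ["Unmap_14", "Unmap_55", "Unmap_37"]

# Joining with "|" yields one regular expression with one alternative per sample.
sample_pattern = "|".join(samples)
print(sample_pattern)  # Unmap_14|Unmap_55|Unmap_37


def is_allowed(value: str) -> bool:
    """Hypothetical helper: True if value is exactly one of the sample names."""
    return re.fullmatch(sample_pattern, value) is not None


print(is_allowed("Unmap_14"))    # True
print(is_allowed("Unmap_14_1"))  # False
```

If sample names could contain regex metacharacters such as . or +, passing each name through re.escape before joining would be a sensible precaution.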

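Then another problem immediately shows up, namely the extension and separator wildcards. Snakemake can only infer wildcard values from the output file names it is asked to produce; a wildcard constraint restricts what a wildcard may match, but it does not assign a value. In rule pear the separator wildcard appears only in the input, and in rule prodigal the same is true of extension, so Snakemake cannot determine them. We can use f-string syntax to fill in what the values should be: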

    rule pear:
        input:
            forward = f"{{sample}}{config['separator']}1.{{extension}}",
            reverse = f"{{sample}}{config['separator']}2.{{extension}}"
        ...

and:

    rule prodigal:
        input:
            f"merged_reads/{{sample}}.{config['file_extension']}"
        ...
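
As a quick check of what those f-strings evaluate to, the sketch below builds the same strings in plain Python. The config dict here is a stand-in for the values in the example config.yaml; inside a Snakefile, config is provided by Snakemake.

```python
# Stand-in for the config that Snakemake would load from config.yaml.
config = {"separator": "_", "file_extension": "fastq"}

# {{sample}} is an escaped brace pair, so it survives as the literal {sample}
# wildcard; {config['separator']} is evaluated immediately by Python.
forward = f"{{sample}}{config['separator']}1.{{extension}}"
merged = f"merged_reads/{{sample}}.{config['file_extension']}"

print(forward)  # {sample}_1.{extension}
print(merged)   # merged_reads/{sample}.fastq
```

The resulting strings still contain ordinary Snakemake wildcards in single braces, while the config values have already been substituted.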

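If wildcard constraints confuse you, take a look at how snakemake uses regex in wildcards. If the f"" syntax confuses you, look up a blog post on f-strings, in particular on when to use a single { and when to use a double {{ to escape it.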

Hope that helps!

Regarding "python - Missing input files for rule all in snakemake", the original question is on Stack Overflow: https://stackoverflow.com/questions/59085429/
