Missing Output Exception Error in Snakemake

Reposted. Author: bug小助手. Updated: 2023-10-24 17:55:11



I am using Snakemake version 7.30.1.


I am trying to run my Snakemake workflow with snakemake --cores 4. Snakemake locates the input files and starts executing the first rule in the workflow, but then exits with a MissingOutputException stating that it cannot find the output files for the second of the two samples in the samples list. This does not seem to be a problem with the files themselves: when I swap the order of the samples, the new first sample runs and the new second sample does not. I have also tried changing the latency wait, but that did not help.


I am trying to run fastp in my first rule for two samples and two reads. The output should produce the files M31A_150k_1_final.fq, M28B_150k_1_final.fq, M31A_150k_2_final.fq, M28B_150k_2_final.fq:


base_path = "/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/"

# Define list of sample names
samples = ["M31A_150k", "M28B_150k"]

rule all:
    input:
        expand(base_path + "bai/{sample}_all.bam.bai", sample=samples),
        expand(base_path + "bai/{sample}_forward.bam.bai", sample=samples),
        expand(base_path + "bai/{sample}_reverse.bam.bai", sample=samples),
        expand(base_path + "bigwig/{sample}.bw", sample=samples),
        expand(base_path + "bigwig/{sample}_forward.bw", sample=samples),
        expand(base_path + "bigwig/{sample}_reverse.bw", sample=samples)

rule fastp_adaptors:
    input:
        R1 = expand(base_path + "testfiles/{sample}_1.fq", sample=samples),
        R2 = expand(base_path + "testfiles/{sample}_2.fq", sample=samples)
    output:
        R1_final = expand(base_path + "trimmed/{sample}_1_final.fq", sample=samples),
        R2_final = expand(base_path + "trimmed/{sample}_2_final.fq", sample=samples)
    shell:
        """
        fastp -w 8 --dont_eval_duplication -i {input.R1} -I {input.R2} -t 10 -F 10 -o {output.R1_final} -O {output.R2_final} --detect_adapter_for_pe
        """


Here is the log of the error I am receiving:


valeriaaizen@Valerias-MacBook-Pro ~/D/c/n/snakemake-attempt (main)> snakemake --cores 4 (myenv_x86)
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 4
Rules claiming more threads will be scaled down.
Job stats:
job                        count    min threads    max threads
-----------------------  -------  -------------  -------------
all                            1              1              1
bowtie2                        1              1              1
deeptools_bigwigall            1              1              1
deeptools_bigwigforward        1              1              1
deeptools_bigwigreverse        1              1              1
fastp_adaptors                 1              1              1
merge_83163                    1              1              1
merge_99147                    1              1              1
reverse                        1              1              1
samtools_indexall              1              1              1
samtools_indexforward          1              1              1
samtools_sort                  1              4              4
samtools_sort147               1              1              1
samtools_sort163               1              1              1
samtools_sort83                1              1              1
samtools_sort99                1              1              1
total                         16              1              4

Select jobs to execute...

[Thu Sep 7 14:39:53 2023]
rule fastp_adaptors:
input: /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_1.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_1.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_2.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_2.fq
output: /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq
jobid: 4
reason: Missing output files: /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq
resources: tmpdir=/var/folders/4c/h8ky28xj143dkssjycttn5lr0000gn/T

Detecting adapter sequence for read1...

Illumina TruSeq Adapter Read 1
AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

Detecting adapter sequence for read2...
No adapter detected for read2

Read1 before filtering:
total reads: 150000
total bases: 22500000
Q20 bases: 21987079(97.7204%)
Q30 bases: 21372363(94.9883%)

Read2 before filtering:
total reads: 150000
total bases: 22500000
Q20 bases: 21768444(96.7486%)
Q30 bases: 21103172(93.7919%)

Read1 after filtering:
total reads: 136856
total bases: 18856683
Q20 bases: 18594358(98.6088%)
Q30 bases: 18347138(97.2978%)

Read2 after filtering:
total reads: 136856
total bases: 17587532
Q20 bases: 17259790(98.1365%)
Q30 bases: 16852551(95.821%)

Filtering result:
reads passed filter: 273712
reads failed due to low quality: 2162
reads failed due to too many N: 18
reads failed due to too short: 24108
reads with adapter trimmed: 35295
bases trimmed due to adapters: 2204956

Insert size peak (evaluated by paired-end reads): 150

JSON report: fastp.json
HTML report: fastp.html

fastp -w 8 --dont_eval_duplication -i /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_1.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_1.fq -I /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_2.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_2.fq -t 10 -F 10 -o /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq -O /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq --detect_adapter_for_pe
fastp v0.22.0, time used: 8 seconds
Waiting at most 5 seconds for missing files.
MissingOutputException in rule fastp_adaptors in file /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/Snakefile, line 35:
Job 4 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq
/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq
Removing output files of failed job fastp_adaptors since they might be corrupted:
/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-09-07T143950.741220.snakemake.log

Recommended answer

rule fastp_adaptors:
    input:
        R1 = expand(base_path + "testfiles/{sample}_1.fq", sample=samples),
        R2 = expand(base_path + "testfiles/{sample}_2.fq", sample=samples)
    output:
        R1_final = expand(base_path + "trimmed/{sample}_1_final.fq", sample=samples),
        R2_final = expand(base_path + "trimmed/{sample}_2_final.fq", sample=samples)
    shell:
        """
        fastp -w 8 --dont_eval_duplication -i {input.R1} -I {input.R2} -t 10 -F 10 -o {output.R1_final} -O {output.R2_final} --detect_adapter_for_pe
        """

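I believe fastp_adaptors has to run once per pair of FASTQ files (two runs in your case). However, because the input and output directives use expand, Snakemake creates a single fastp_adaptors job that receives all four input files and is expected to produce all four output files. As your log shows, fastp then processes only one pair, so the output files for the second sample are never written, which causes the error. Try removing the expand calls in fastp_adaptors so that {sample} remains a wildcard and Snakemake creates one job per sample. (If you are new to Snakemake, this is one of the things that commonly trips up beginners.)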
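A minimal sketch of why a single job ends up with all four files. It assumes a simplified, single-wildcard stand-in for expand (the helper expand_one is hypothetical and not part of Snakemake's API), and the paths are shortened for readability:

```python
# Simplified stand-in for Snakemake's expand() with exactly one wildcard.
# `expand_one` is a hypothetical helper for illustration only.
def expand_one(pattern, **wildcards):
    name, values = next(iter(wildcards.items()))
    # One formatted path per wildcard value, returned as a plain list
    return [pattern.format(**{name: value}) for value in values]


samples = ["M31A_150k", "M28B_150k"]

r1 = expand_one("testfiles/{sample}_1.fq", sample=samples)
r2 = expand_one("testfiles/{sample}_2.fq", sample=samples)

# Inside a shell directive, a list of files is rendered as one
# space-separated string, so every sample lands in the same command.
command = "fastp -i {} -I {}".format(" ".join(r1), " ".join(r2))
print(command)
```

The resulting string has the same shape as the fastp command in the log of the question: two files after -i and two after -I, all in one invocation.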

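A sketch of the rule with the expand calls removed. It assumes base_path is defined as in the question and that the downstream rules (not shown in the question) request the trimmed files one sample at a time:

```snakemake
# Assumes base_path is defined earlier in the Snakefile, as in the question.
rule fastp_adaptors:
    input:
        # {sample} is left as a wildcard instead of being expanded to a list
        R1 = base_path + "testfiles/{sample}_1.fq",
        R2 = base_path + "testfiles/{sample}_2.fq"
    output:
        R1_final = base_path + "trimmed/{sample}_1_final.fq",
        R2_final = base_path + "trimmed/{sample}_2_final.fq"
    shell:
        """
        fastp -w 8 --dont_eval_duplication -i {input.R1} -I {input.R2} -t 10 -F 10 -o {output.R1_final} -O {output.R2_final} --detect_adapter_for_pe
        """
```

With {sample} left as a wildcard, Snakemake instantiates the rule once for each sample whose trimmed files are requested further down the workflow. Note that fastp writes its reports to fastp.json and fastp.html by default, so jobs running in parallel may overwrite each other's reports; fastp's -j and -h options can set per-sample report names if that matters.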


Comments

Oh wow it worked! Thank you so much, that makes a lot more sense!

OK, glad it worked. Can you accept the answer if you are happy with it? That way the question is marked as resolved.
