
python - Preprocessing for wildcarded Snakemake rules

Reposted. Author: 行者123. Updated: 2023-12-04 13:25:52

I have a Snakemake recipe that contains a very expensive preparation step that is common to all invocations. Here is a pseudo-rule for demonstration:

rule sample:
    input:
        "{name}.config"
    output:
        "{name}.npz"
    run:
        import somemodule

        data = somemodule.Loader("some_big_data")  # expensive
        np.savez(output, data.process(input))  # also expensive
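
Currently, data is loaded from scratch for every target, which is far from ideal. How can I make it load only once?
I am looking for something that would let me rewrite the rule as: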
rule sample:
    input:
        "{name}.config"
    output:
        "{name}.npz"
    setup:
        import somemodule

        data = somemodule.Loader("some_big_data")  # expensive
    run:
        np.savez(output, data.process(input))  # also expensive

or:
rule sample:
    input:
        "{name}.config"
    output:
        "{name}.npz"
    run:
        import somemodule

        data = somemodule.Loader("some_big_data")  # expensive

        for job in jobs:
            np.savez(job.output,
                     data.process(job.input))  # also expensive

In another question I have described the code that Loader.__init__() is based on.

Best answer

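One possible solution is to create a pickled object that holds the data of interest. Please review the security considerations of using pickled objects to check whether this approach suits your case. If it does, the rule would go along the following lines: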

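rule sample:
    input:
        "{name}.config"
    output:
        pickle = "{name}.pickle",
    run:
        import pickle
        import somemodule

        data = somemodule.Loader("some_big_data")  # expensive
        # pickle.dump takes the object and an open binary file, not a path
        with open(output.pickle, "wb") as handle:
            pickle.dump(data, handle)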

In downstream rules you can reference the pickle file like any other file; just make sure to load it with pickle.load.
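
The save-once, load-per-job pattern can be sketched in plain Python, outside of Snakemake. This is only an illustration under stated assumptions: build_state and process are hypothetical stand-ins for somemodule.Loader(...) and data.process(...), and the expensive state is assumed to be representable as plain Python data (a dict here). Pickling an actual Loader instance would additionally require somemodule to be importable wherever the file is loaded.

```python
import os
import pickle
import tempfile


def build_state(source):
    # Hypothetical stand-in for the expensive somemodule.Loader(...) call.
    return {"source": source, "scale": 2}


def save_state(state, path):
    # pickle.dump expects an open binary file object, not a path string.
    with open(path, "wb") as handle:
        pickle.dump(state, handle)


def load_state(path):
    # Only unpickle files you created yourself: loading can execute code.
    with open(path, "rb") as handle:
        return pickle.load(handle)


def process(state, value):
    # Hypothetical stand-in for data.process(input).
    return value * state["scale"]


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "loader.pickle")
        save_state(build_state("some_big_data"), path)  # done once, upstream
        state = load_state(path)  # done in each downstream job
        return state["source"], [process(state, v) for v in (1, 2, 3)]
```

Whether this saves time depends on unpickling being cheaper than rebuilding the object, which is worth measuring for the real Loader before adopting the approach.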

Regarding python - Preprocessing for wildcarded Snakemake rules, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68622282/
