
python - How to use pandas in a snakemake pipeline

Reposted. Author: 行者123. Updated: 2023-12-05 03:28:53

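I want to improve the reproducibility of some Python code I wrote by turning it into a data pipeline. I am used to targets in R and would like to find an equivalent in Python. My impression is that snakemake comes very close to that.

I don't understand how to use pandas inside a snakemake task to import the input, modify it, and then write the output.

Let's take the simplest pipeline I can think of: we take a csv and write a copy somewhere else.

The pipeline works fine when using bash: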

rule trying_snakemake:
    input:
        path="untitled.txt"
    output:
        "test-snakemake.csv"
    run:
        shell("cp {input.path} {output}")

I would like to do the equivalent with pandas (of course pandas seems unnecessary here, but the point is to understand the logic):

rule trying_snakemake:
    input:
        path="untitled.txt"
    output:
        "test-snakemake.csv"
    run:
        import pandas as pd
        df = pd.read_csv({input.path})
        df.to_csv({output}, header=False)

snakemake -c1
Invalid file path or buffer object type: <class 'set'>
File "/home/jovyan/work/label-openfood/Snakefile", line 19, in __rule_trying_snakemake
File "/opt/conda/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 482, in _read
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 51, in __init__
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py", line 222, in _open_handles
File "/opt/conda/lib/python3.9/site-packages/pandas/io/common.py", line 609, in get_handle
File "/opt/conda/lib/python3.9/site-packages/pandas/io/common.py", line 396, in _get_filepath_or_buffer
File "/opt/conda/lib/python3.9/concurrent/futures/thread.py", line 52, in run
Exiting because a job execution failed. Look above for error message

I think the error occurs at the read_csv step, but I don't understand what it means (I am used to pandas just working).

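Best answer

You are very close: the curly braces are not needed inside the run directive. There, input and output are ordinary Python objects, so {input.path} is a Python set literal containing the path, which is why pandas reports the type <class 'set'>. The braces are only placeholders inside strings passed to shell():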

rule trying_snakemake:
    input:
        path="untitled.txt"
    output:
        csv="test-snakemake.csv"
    run:
        import pandas as pd
        df = pd.read_csv(input.path)
        df.to_csv(output.csv, header=False)
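
The difference can be reproduced outside Snakemake. The following is a minimal sketch, assuming pandas is installed. The variables src and dst are hypothetical stand-ins for input.path and output.csv, and the sample CSV content is made up for illustration.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for input.path and output.csv
    src = os.path.join(tmp, "untitled.txt")
    dst = os.path.join(tmp, "test-snakemake.csv")

    # Made-up sample data, only for illustration
    with open(src, "w") as fh:
        fh.write("a,b\n1,2\n3,4\n")

    # In plain Python code, braces around a value build a set literal
    wrapped = {src}
    print(type(wrapped))

    # Passing the set to pandas fails, as in the question
    try:
        pd.read_csv(wrapped)
        error_message = None
    except ValueError as err:
        error_message = str(err)
    print(error_message)

    # Passing the path itself works
    df = pd.read_csv(src)
    df.to_csv(dst, header=False)

    with open(dst) as fh:
        written = fh.read()
    print(written)
```

With header=False the column names are dropped, and the row index is still written as the first column, so the result is not a byte-for-byte copy of the input. If an exact copy is the goal, keeping the header and passing index=False to to_csv would be closer to what cp does.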

Regarding python - How to use pandas in a snakemake pipeline, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71162082/
