I am here because I have not come across any solution on the internet yet. I am trying to write a Nextflow workflow that splits a big table, computes statistics for each split table, and then merges the small stats tables.
I am having trouble with the table-splitting process. I want to split the table while keeping the header intact in each of the smaller tables. The Bash code is something like this:
head -n1 "${table2parse}" > header.tsv ## take the header line
tail -n+2 "${table2parse}" | split -l 4 - chunk_ ## split the table w/o headers
for f in chunk_*; do cat header.tsv "$f" > "split_table_$f.tsv"; done ## add the header to each chunk
So far this works. However, when I tried to incorporate this into a Nextflow pipeline:
process splitTable {
    input:
    path table2parse
    output:
    path 'split_table_*'
    """
    head -n1 '${table2parse}' > header.tsv ## take the header line
    tail -n+2 '${table2parse}' | split -l 4 - chunk_ ## split the table w/o headers
    for f in chunk_*; do cat header.tsv $f > 'split_table_$f.tsv'; done ## add the header to each chunk
    """
}
I get this error:
Caused by:
No such variable: f -- Check script 'trial.nf' at line: 16
Apparently Nextflow confuses the Bash variable with its own variables. I tried using an escape character ('\f') and declaring it as a Nextflow variable, but to no avail.
Therefore I would be really grateful to anyone with suggestions.
PS: I have recently started learning Nextflow's DSL2 syntax. If you have recommendations on that, I am all ears!
Recommended answer
Reduce the problem to a Bash script that takes the required parameter. Test it independently of the Nextflow process, then call the script from the Nextflow process.
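As a rough sketch of what such a standalone script might look like, assuming it is saved under the hypothetical name split_table.sh and that the number of rows per chunk is passed as an optional second argument (neither the name nor that argument comes from the question):

```shell
#!/usr/bin/env bash
# split_table.sh (hypothetical name): split a table into chunks that each keep the header.

# split_table TABLE [LINES_PER_CHUNK]
# Writes split_table_chunk_aa.tsv, split_table_chunk_ab.tsv, ... into the current directory.
split_table() {
    local table="$1"
    local lines="${2:-4}"   # assumption: default of 4 rows per chunk, as in the question

    head -n 1 "$table" > header.tsv                    # take the header line
    tail -n +2 "$table" | split -l "$lines" - chunk_   # split the body without the header
    local f
    for f in chunk_*; do
        [ -e "$f" ] || continue                        # skip if the table had no data rows
        cat header.tsv "$f" > "split_table_${f}.tsv"   # prepend the header to each chunk
    done
    rm -f header.tsv chunk_*                           # remove intermediate files
}

# Run only when invoked with arguments, e.g.: ./split_table.sh big_table.tsv 4
if [ "$#" -ge 1 ]; then
    split_table "$@"
fi
```

Because the loop variable now lives in a separate file, Nextflow never sees `$f` and has nothing to interpolate. If the script is made executable and placed in the pipeline's `bin/` directory, the process script block could then be reduced to a single line such as `split_table.sh ${table2parse} 4`. If the loop is kept inline in a double-quoted script block instead, each Bash dollar sign has to be escaped with a backslash (`\$f`) so that Nextflow leaves it for Bash.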