Nextflow: manipulating variables between processes

Question

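I'm redesigning a workflow that starts with one process which spawns multiple other processes. Originally I had these variables before the workflow started, so I made a tuple of them and passed it as an input to the process. The process took each value and spawned a process for each value in the tuple.

In my new architecture, however, the "tuple" is produced inside processA. processB then needs to take each value as an input and spawn one process per input.

My tuple looks like: {"002--002": some_params,"004--004": some_params,etc.}

I currently have these values as a list in Python: ['052--052','054--054','055--055','059--059','060--060','066--066']

How can I parse this Python list so that each value is passed on as a parameter and spawns its own process?

ProcessA also creates files such as somefile_052--052.someextension. I basically want to pass the right variable along with the right files.

Any help would be greatly appreciated.

Here is some code

These are the files I need to work with. I need to send all files that share the same code, together with that code as a variable.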

> ls
out.barcoded.subreads.bam             out.subreads.060--060.bam.pbi         out.subreads.090--090.subreadset.xml  out.subreads.149--149.bam             out.subreads.192--192.bam.pbi         out.subreads.249--249.subreadset.xml  out.subreads.285--285.bam             out.subreads.321--321.bam.pbi         out.subreads.479--479.subreadset.xml
out.barcoded.subreads.bam.pbi         out.subreads.060--060.subreadset.xml  out.subreads.091--091.bam             out.subreads.149--149.bam.pbi         out.subreads.192--192.subreadset.xml  out.subreads.252--252.bam             out.subreads.285--285.bam.pbi         out.subreads.321--321.subreadset.xml  out.subreads.482--482.bam
out.barcoded.subreads.lima.counts     out.subreads.066--066.bam             out.subreads.091--091.bam.pbi         out.subreads.149--149.subreadset.xml  out.subreads.227--227.bam             out.subreads.252--252.bam.pbi         out.subreads.285--285.subreadset.xml  out.subreads.454--454.bam             out.subreads.482--482.bam.pbi
out.barcoded.subreads.lima.guess      out.subreads.066--066.bam.pbi         out.subreads.091--091.subreadset.xml  out.subreads.172--172.bam             out.subreads.227--227.bam.pbi         out.subreads.252--252.subreadset.xml  out.subreads.303--303.bam             out.subreads.454--454.bam.pbi         out.subreads.482--482.subreadset.xml
out.barcoded.subreads.lima.report     out.subreads.066--066.subreadset.xml  out.subreads.107--107.bam             out.subreads.172--172.bam.pbi         out.subreads.227--227.subreadset.xml  out.subreads.259--259.bam             out.subreads.303--303.bam.pbi         out.subreads.454--454.subreadset.xml  out.subreads.489--489.bam
out.barcoded.subreads.lima.summary    out.subreads.071--071.bam             out.subreads.107--107.bam.pbi         out.subreads.172--172.subreadset.xml  out.subreads.233--233.bam             out.subreads.259--259.bam.pbi         out.subreads.303--303.subreadset.xml  out.subreads.464--464.bam             out.subreads.489--489.bam.pbi
out.barcoded.subreads.subreadset.xml  out.subreads.071--071.bam.pbi         out.subreads.107--107.subreadset.xml  out.subreads.175--175.bam             out.subreads.233--233.bam.pbi         out.subreads.259--259.subreadset.xml  out.subreads.307--307.bam             out.subreads.464--464.bam.pbi         out.subreads.489--489.subreadset.xml
out.subreads.052--052.bam             out.subreads.071--071.subreadset.xml  out.subreads.112--112.bam             out.subreads.175--175.bam.pbi         out.subreads.233--233.subreadset.xml  out.subreads.261--261.bam             out.subreads.307--307.bam.pbi         out.subreads.464--464.subreadset.xml  out.subreads.494--494.bam
out.subreads.052--052.bam.pbi         out.subreads.082--082.bam             out.subreads.112--112.bam.pbi         out.subreads.175--175.subreadset.xml  out.subreads.235--235.bam             out.subreads.261--261.bam.pbi         out.subreads.307--307.subreadset.xml  out.subreads.468--468.bam             out.subreads.494--494.bam.pbi
out.subreads.052--052.subreadset.xml  out.subreads.082--082.bam.pbi         out.subreads.112--112.subreadset.xml  out.subreads.185--185.bam             out.subreads.235--235.bam.pbi         out.subreads.261--261.subreadset.xml  out.subreads.313--313.bam             out.subreads.468--468.bam.pbi         out.subreads.494--494.subreadset.xml
out.subreads.054--054.bam.pbi         out.subreads.082--082.subreadset.xml  out.subreads.113--113.bam             out.subreads.185--185.bam.pbi         out.subreads.235--235.subreadset.xml  out.subreads.264--264.bam             out.subreads.313--313.bam.pbi         out.subreads.468--468.subreadset.xml  out.subreads.bam
out.subreads.054--054.subreadset.xml  out.subreads.085--085.bam             out.subreads.113--113.bam.pbi         out.subreads.185--185.subreadset.xml  out.subreads.241--241.bam             out.subreads.264--264.bam.pbi         out.subreads.313--313.subreadset.xml  out.subreads.471--471.bam             out.subreads.bam.pbi
out.subreads.055--055.bam             out.subreads.085--085.bam.pbi         out.subreads.113--113.subreadset.xml  out.subreads.187--187.bam             out.subreads.241--241.bam.pbi         out.subreads.264--264.subreadset.xml  out.subreads.316--316.bam             out.subreads.471--471.bam.pbi         out.subreads.json
out.subreads.055--055.bam.pbi         out.subreads.085--085.subreadset.xml  out.subreads.125--125.bam             out.subreads.187--187.bam.pbi         out.subreads.241--241.subreadset.xml  out.subreads.265--265.bam             out.subreads.316--316.bam.pbi         out.subreads.471--471.subreadset.xml  out.subreads.lima.counts
out.subreads.055--055.subreadset.xml  out.subreads.088--088.bam             out.subreads.125--125.bam.pbi         out.subreads.187--187.subreadset.xml  out.subreads.245--245.bam             out.subreads.265--265.bam.pbi         out.subreads.316--316.subreadset.xml  out.subreads.473--473.bam             out.subreads.lima.guess
out.subreads.059--059.bam             out.subreads.088--088.bam.pbi         out.subreads.125--125.subreadset.xml  out.subreads.188--188.bam             out.subreads.245--245.bam.pbi         out.subreads.265--265.subreadset.xml  out.subreads.317--317.bam             out.subreads.473--473.bam.pbi         out.subreads.lima.report
out.subreads.059--059.bam.pbi         out.subreads.088--088.subreadset.xml  out.subreads.143--143.bam             out.subreads.188--188.bam.pbi         out.subreads.245--245.subreadset.xml  out.subreads.273--273.bam             out.subreads.317--317.bam.pbi         out.subreads.473--473.subreadset.xml  out.subreads.lima.summary
out.subreads.059--059.subreadset.xml  out.subreads.090--090.bam             out.subreads.143--143.bam.pbi         out.subreads.188--188.subreadset.xml  out.subreads.249--249.bam             out.subreads.273--273.bam.pbi         out.subreads.317--317.subreadset.xml  out.subreads.479--479.bam             out.subreads.subreadset.xml
out.subreads.060--060.bam             out.subreads.090--090.bam.pbi         out.subreads.143--143.subreadset.xml  out.subreads.192--192.bam             out.subreads.249--249.bam.pbi         out.subreads.273--273.subreadset.xml  out.subreads.321--321.bam             out.subreads.479--479.bam.pbi

So I want to send these files along with this variable: 059--059

out.subreads.059--059.bam
out.subreads.059--059.bam.pbi
out.subreads.059--059.subreadset.xml

The code currently in my workflow is:

process procA{
    input:
    file bc_fasta from bc_fasta_chan

    output:
    set file("$analysis_config.cell/bam/out.subreads.*"),val("$analysis_config.cell/bam/out.subreads.*") into lima_out

    script:
    """
    # run the script that generates the files listed above
    """
}

process procB{
    input:
    set file(bc_bam_file),val(bc_name) from lima_out.flatten()

    script:
    """
    ls
    echo ${bc_bam_file}
    """
}

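Answer

The trick is to extract the grouping variable from the file names somehow and then call groupTuple. I've just used a simple regular expression to get this variable, but you could implement something more sophisticated if needed: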

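// Collect every out.subreads.* file; relative: true emits paths relative
// to their common parent directory
lima_out = Channel.fromPath( './files/out.subreads.*', relative: true )

// Capture group 1 is the barcode pair, e.g. "059--059"
subreads_pattern = ~/^out\.subreads\.(\d{3}--\d{3})\..*/

lima_out
    .flatten()
    .filter { it.name =~ subreads_pattern }    // skip files without a barcode pair
    .map { tuple( (it.name =~ subreads_pattern)[0][1], it ) }    // (barcode, file)
    .groupTuple(size: 3, sort: true)    // emit a group once it holds 3 files
    .view()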

Results:

[489--489,[out.subreads.489--489.bam,out.subreads.489--489.bam.pbi,out.subreads.489--489.subreadset.xml]]
[316--316,[out.subreads.316--316.bam,out.subreads.316--316.bam.pbi,out.subreads.316--316.subreadset.xml]]
...
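
Since the question starts from a Python list of barcode names, the grouping step above can also be sketched in plain Python. This only illustrates what the regular expression and groupTuple(size: 3, sort: true) roughly do; it is not Nextflow code, and the function name group_by_barcode and the sample file names are made up for the example.

```python
import re
from collections import defaultdict

# Same pattern as subreads_pattern in the Nextflow snippet:
# capture group 1 is the barcode pair, e.g. "059--059".
SUBREADS_PATTERN = re.compile(r"^out\.subreads\.(\d{3}--\d{3})\..*")


def group_by_barcode(file_names, size=3):
    """Group file names by barcode pair, keeping only complete groups."""
    groups = defaultdict(list)
    for name in file_names:
        match = SUBREADS_PATTERN.match(name)
        if match:  # names such as out.subreads.bam carry no barcode pair
            groups[match.group(1)].append(name)
    # Roughly mimics size: 3 (incomplete groups are dropped) and sort: true.
    return {
        barcode: sorted(names)
        for barcode, names in groups.items()
        if len(names) == size
    }


# Hypothetical input: one complete group, one incomplete group,
# and one file without a barcode pair.
example_files = [
    "out.subreads.059--059.subreadset.xml",
    "out.subreads.059--059.bam",
    "out.subreads.bam",
    "out.subreads.059--059.bam.pbi",
    "out.subreads.054--054.bam.pbi",
]
grouped = group_by_barcode(example_files)
print(grouped)
```

In this sketch the 054--054 entry is dropped because only one of its three files is present, and out.subreads.bam is ignored because it does not match the pattern.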

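Here's an example of how I might feed these values into a process. My preference for handling companion files (in this case, the files with the ".bam.pbi" extension) is to keep them together with the BAM files. I've just used a tuple for this. By calling first() on that tuple, we get the BAM. This is only my preference, though. You could instead have a separate file/path variable in the input tuple for the pbi companion file, but you probably won't need to reference it in the script block.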

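// Reuses lima_out and subreads_pattern from the snippet above (this chain
// replaces the one ending in .view()). The final map splits each sorted
// group into the subreadset XML and the BAM plus its index.
lima_out
    .filter { it.name =~ subreads_pattern }
    .map { tuple( (it.name =~ subreads_pattern)[0][1], it ) }
    .groupTuple(size: 3, sort: true)
    .map { group_name, files -> tuple( group_name, files[2], files[0..1] ) }
    .set { subreads_ch }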

process next_process {

    input:
    tuple val(group),path(subreadset),path(indexed_subreads) from subreads_ch

    """
    echo "subreadset XML: ${subreadset}"
    echo "subreads BAM: ${indexed_subreads.first()}"
    """
}
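
The files[2] and files[0..1] indices in the map closure depend on the order of the files within each group. A quick plain-Python check of that order, assuming a lexicographic sort stands in for what sort: true produces (the variable names here are illustrative and not part of the workflow):

```python
# The three files of one barcode group, in arbitrary order.
files = [
    "out.subreads.059--059.subreadset.xml",
    "out.subreads.059--059.bam.pbi",
    "out.subreads.059--059.bam",
]

# Assumption: a plain lexicographic sort stands in for groupTuple(sort: true).
files = sorted(files)

subreadset = files[2]          # corresponds to files[2] in the map closure
indexed_subreads = files[0:2]  # corresponds to Groovy's inclusive files[0..1]

print(subreadset)
print(indexed_subreads[0])     # the BAM, i.e. what indexed_subreads.first() refers to
```

Under that assumption the BAM sorts first, its .bam.pbi index second, and the subreadset XML last, which is why the XML is taken from position 2.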
