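Missing output file(s) expected by Nextflow process

Problem description

I have a Nextflow process that takes several input files and produces some output files. Inside the process, a conditional removes any output file that turns out to be empty.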

    process imputation {
    input:
    set val(chrom),val(chunk_array),val(chunk_start),val(chunk_end),path(in_haps),path(refs),path(maps) from imp_ch
    output:
    tuple val("${chrom}"),path("${chrom}.*") into imputed
    script:
    def (haps,sample)=in_haps
    def (haplotype,legend,samples)=refs
    """
    impute4 -g "${haps}" -h "${haplotype}" -l "${legend}" -m "${maps}" -o "${chrom}.imputed.chunk${chunk_array}" -no_maf_align -o_gz -int "${chunk_start}" "${chunk_end}" -Ne 20000 -buffer 1000 -seed 54321
    if [[ \$(gunzip -c "${chrom}.imputed.chunk${chunk_array}.gen.gz" | head -c1 | wc -c) == "0" ]]
    then
     rm "${chrom}.imputed.chunk${chunk_array}.gen.gz"
    else
     qctools -g "${chrom}.imputed.chunk${chunk_array}.gen.gz" -snp-stats -osnp "${chrom}.imputed.chunk${chunk_array}.snp.stats"
    fi
    """
    }

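The process itself ran fine. The impute4 program writes *gen.gz output files, some of which may be empty. I added the if statement to delete those empty files, because qctools cannot read an empty file and the process crashes. The problem is that I now get this error: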

Missing output file(s) `chr16*` expected by process `imputation (165)` (note: input files are not included in the default matching set)

How can I solve this? Any help is appreciated.

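Solution

Using the optional pattern, as suggested by user jfy133, would be one way to solve your problem. In any case, you may want to split the two commands into separate processes.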

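You can also store the line count, or the result of the test statement you use in your if clause, and apply the Nextflow filter or branch operators to the output channel of the first process before running qctools.

Filter:

Channel
    .from( 1,2,3,4,5 )
    .filter { it % 2 == 1 }

Branch:

Channel
    .from(1,40,50)
    .branch {
        small: it < 10
        large: it > 10
    }
    .set { result }

 result.small.view { "$it is small" }
 result.large.view { "$it is large" }

Your solution could apply the same operators to the output channel of the imputation process.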
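As a rough sketch of the store-then-filter idea in the DSL1 syntax used in the question, the toy pipeline below records a byte count in a first process, filters the output channel on it, and only then runs a second process. The process names `produce` and `stats`, the variable `N_BYTES`, the channel names, and the toy shell commands are all hypothetical stand-ins for impute4 and qctools. It also assumes a Nextflow release that still supports DSL1 and the `env` output qualifier.

```nextflow
// Hypothetical toy input: one chunk that yields data and one that does not.
names_ch = Channel.from('empty', 'full')

process produce {
    input:
    val name from names_ch

    output:
    // Emit the file together with the byte count computed in the script.
    tuple val(name), path("${name}.txt.gz"), env(N_BYTES) into produced_ch

    script:
    """
    if [ "${name}" = "full" ]; then
        echo "data" | gzip > "${name}.txt.gz"
    else
        printf '' | gzip > "${name}.txt.gz"
    fi
    # Same emptiness test as in the question: 0 means no decompressed content.
    N_BYTES=\$(gunzip -c "${name}.txt.gz" | head -c1 | wc -c)
    """
}

// Keep only tuples whose file has content, then drop the count.
produced_ch
    .filter { (it[2] as String).trim().toInteger() > 0 }
    .map { tuple(it[0], it[1]) }
    .set { nonempty_ch }

process stats {
    input:
    tuple val(name), path(gz) from nonempty_ch

    output:
    path "${name}.stats" into stats_ch

    script:
    """
    gunzip -c "${gz}" | wc -l > "${name}.stats"
    """
}
```

With this layout the first process always emits its declared output, so the "Missing output file(s)" error should not arise, and the second process never sees an empty file.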

Does this Nextflow pattern help?

Short version:

process foo {
  output:
  file 'foo.txt' optional true into foo_ch

  script:
  '''
  your_command
  '''
}

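Basically, by declaring the output as optional, the process will not fail if it finds no files matching the declared output glob.

However, depending on how many files are produced, you may want to be more specific in the output declaration about which types of output files are required and which are optional, so that your process still fails if all commands fail (for whatever reason).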
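A minimal sketch of mixing required and optional outputs, again in DSL1 syntax, might look like the following. The process name `make_files`, the channel names, and the file names are hypothetical; the point is only that the first declaration is mandatory while the second tolerates a missing file.

```nextflow
// Hypothetical toy input.
names_ch = Channel.from('a', 'b')

process make_files {
    input:
    val name from names_ch

    output:
    // Required: the task fails if this file is not produced.
    path "${name}.main.txt" into main_ch
    // Optional: the task succeeds even if this file is absent.
    path "${name}.stats.txt" optional true into stats_ch

    script:
    """
    echo "${name}" > "${name}.main.txt"
    # Only one of the two inputs produces the optional file.
    if [ "${name}" = "a" ]; then
        echo "stats" > "${name}.stats.txt"
    fi
    """
}

main_ch.view { "main: $it" }
stats_ch.view { "stats: $it" }
```

Applied to the question, this would presumably mean keeping the main imputation output as a required declaration and marking only the qctools statistics file as optional, so that a complete failure of impute4 is still reported as an error.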
