InputFunction error in Snakemake when returning a list of files from a lambda function

Problem description

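I am writing a Snakemake rule that takes an input value from a parsed YAML file and returns the files associated with that group label as a list, but I am running into a strange error.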

My function prints its return value just before returning, so it does appear to return a list:

['/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl1_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl4_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl2_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl3_featureCounts_results.txt']

However, I get an "AttributeError". This is unexpected, because I adapted the code directly from an earlier pipeline in which this function worked perfectly:

InputFunctionException in line 26 of /SAN/vyplab/alb_projects/pipelines/rna_seq_snakemake/rules/deseq2_featureCounts.smk:
AttributeError: 'str' object has no attribute 'list'
Wildcards:
bse=control
contrast=ContrastvControl
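
For context, this message means that some code evaluated `<something>.list` where `<something>` was a plain string. The snippet below is a minimal sketch that reproduces the same message; `lookup_groups` is a hypothetical helper invented for illustration, and which call in the real pipeline triggers the error is not known from the post.

```python
def lookup_groups(entry):
    # Hypothetical helper: it assumes `entry` exposes a `.list` attribute.
    return entry.list


try:
    # Passing a plain string triggers the same AttributeError as in the log.
    lookup_groups("ContrastvControl")
except AttributeError as err:
    message = str(err)
    print(message)
```

Because Snakemake wraps any exception raised inside an input function in an InputFunctionException, the attribute access at fault may sit in the input function itself or in any helper it calls.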

The rule looks like this. I have omitted the shell and params directives because I don't think they are needed for debugging:

rule run_standard_deseq:
    input:
        base_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.bse),
        contrast_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.contrast)
    output:
        os.path.join(DESEQ2_DIR,"{bse}_{contrast}" + "normed_counts.csv.gz")
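
As a rough illustration of how the two named inputs are resolved, the sketch below calls each lambda separately with the same wildcard values. It assumes a `SimpleNamespace` as a stand-in for Snakemake's real wildcards object, and `files_for` is a hypothetical replacement for `featurecounts_files_from_contrast`; neither comes from the pipeline itself.

```python
from types import SimpleNamespace

calls = []


def files_for(grp):
    # Hypothetical stand-in for featurecounts_files_from_contrast.
    calls.append(grp)
    return [f"{grp}_featureCounts_results.txt"]


# One input function per named input, mirroring the rule above.
inputs = {
    "base_group": lambda wildcards: files_for(wildcards.bse),
    "contrast_group": lambda wildcards: files_for(wildcards.contrast),
}

# Stand-in for the wildcards object (an assumption, not Snakemake's class).
wildcards = SimpleNamespace(bse="control", contrast="ContrastvControl")

resolved = {name: func(wildcards) for name, func in inputs.items()}
print(calls)
```

The helper runs once per named input, so a list printed during one call does not by itself show that the other call succeeded.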

Implementation of the helper function:

def featurecounts_files_from_contrast(grp):
    """
    given a contrast name or list of groups return a list of the files in that group
    """
    #reading in the samples
    samples = pd.read_csv(config['sampleCSVpath'])
    #there should be a column which allows you to exclude samples
    samples2 = samples.loc[samples.exclude_sample_downstream_analysis != 1]
    #read in the comparisons and make a dictionary of comparisons,comparisons needs to be in the config file
    compare_dict = load_comparisons()
    #go through the values of the dictionary and break when we find the right groups in that contrast
    grps,comparison_column = return_sample_names_group(grp)
    #take the sample names corresponding to those groups
    if comparison_column == "":
        return([""])
    grp_samples = list(set(list(samples2[samples2[comparison_column].isin(grps)].sample_name)))
    feature_counts_outdir = get_output_dir(config["project_top_level"],config["feature_counts_output_folder"])
    fc_suffix = "_featureCounts_results.txt"

    #build a list with the full path from those sample names
    fc_files = [os.path.join(feature_counts_outdir,x + fc_suffix) \
                   for x in grp_samples]
    fc_files = list(set(fc_files))
    print(fc_files)

    return(fc_files)

The print statement shows the correct files, so I expected this to work.
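
One way to narrow this down is to log the argument and the full Python traceback whenever the helper raises, since the InputFunctionException summary only names the rule line. The following is a minimal sketch using the standard `functools` and `traceback` modules; `with_traceback` and `files_for_group` are hypothetical names, and the deliberate `.list` access only mimics the reported error rather than reproducing the real cause.

```python
import functools
import traceback


def with_traceback(func):
    """Print the arguments and full traceback of any exception, then re-raise."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            print(f"{func.__name__} failed for args={args!r}")
            traceback.print_exc()  # written to stderr
            raise
    return wrapper


@with_traceback
def files_for_group(grp):
    # Hypothetical stand-in: succeeds for one group, fails for the other.
    if grp == "control":
        return ["Ctrl1_featureCounts_results.txt"]
    return grp.list  # deliberate bug that mimics the reported error


ok = files_for_group("control")
try:
    files_for_group("ContrastvControl")
except AttributeError as err:
    failed = str(err)
```

Applied to `featurecounts_files_from_contrast`, a wrapper like this would show which of the two wildcard values triggers the failure and the exact line where a string is being treated as an object with a `.list` attribute.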
