How do I use a published pipeline's PipelineParameter to set Databricks notebook widgets with DatabricksStep?

Problem description

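I created a pipeline with several steps (azureml-defaults==1.23.0). When I run the published pipeline from the studio, the Databricks step always uses the default value of each PipelineParameter, no matter what value I choose when submitting the pipeline.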

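import argparse
from azureml.pipeline.core import Pipeline, PipelineParameter, StepSequence
from azureml.pipeline.steps import DatabricksStep

parser.add_argument('--import_date', type=str, default="2021-04-23")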
....
....

import_date      = PipelineParameter(name="import_date",default_value = params.import_date)
cluster_id       = PipelineParameter(name="cluster_id",default_value = params.cluster_id)
step_type        = PipelineParameter(name="step_type",default_value = params.step_type)
churn_months     = PipelineParameter(name="churnMonths",default_value = params.churnMonths)



# port_start_date is not shown in this excerpt.
data_import_step = DatabricksStep(
    name="Databricks Data Import Step",
    existing_cluster_id=str(cluster_id.default_value),
    notebook_path=import_notebook_path,
    # Map each notebook widget name to a pipeline parameter.
    notebook_params={
        'churnMonthsWidget': churn_months,
        'startDateWidget': port_start_date,
        'ImportDateWidget': import_date,
        'StepTypeWidget': step_type,
    },
    run_name='Job_Data_Import',
    compute_target=databricks_compute,
    allow_reuse=False,
)
.....
.....
.....
pipeline_steps = StepSequence(steps=[
    data_import_step,                  # Step 1
    data_manipulation_step,            # Step 2
    data_extraction_step,              # Step 3
    training_data_preparation_step,    # Step 4
    model_training_step,               # Step 5
    prediction_data_preparation_step,  # Step 6
    prediction_step,                   # Step 7
])
                                    
pipeline = Pipeline(workspace=ws, steps=pipeline_steps)

published_pipeline = pipeline.publish(
    name=params.pipeline_name,
    description=params.pipeline_description,
)

The default value of import_date is 2021-04-23. Even when I set the import_date parameter to '2021-04-22', the Databricks notebook still receives 2021-04-23 as import_date.
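
To rule out the studio UI as the cause, the same run could be triggered programmatically with explicit parameter values. The sketch below builds the request body that the REST endpoint of a published Azure ML pipeline accepts (`ExperimentName` plus `ParameterAssignments`). The helper names `build_run_request` and `submit_run`, the experiment name, and the way the authorization header is obtained are assumptions made for illustration, not part of the original script. Note that the keys in `ParameterAssignments` have to match the `name` passed to each PipelineParameter, so the churn parameter is `churnMonths`, not `churn_months`.

```python
import json
import urllib.request


def build_run_request(experiment_name, **parameter_values):
    """Build the JSON body for a published-pipeline run.

    Keys of parameter_values must equal the PipelineParameter names
    used when the pipeline was built (e.g. "import_date", "churnMonths").
    """
    return {
        "ExperimentName": experiment_name,
        "ParameterAssignments": dict(parameter_values),
    }


def submit_run(endpoint, auth_header, body):
    """POST the body to the published pipeline's REST endpoint.

    endpoint: the value of published_pipeline.endpoint
    auth_header: dict holding the Authorization header (assumed to be
    obtained separately, e.g. from an azureml authentication object).
    """
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={**auth_header, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical experiment name; parameter names come from the pipeline above.
body = build_run_request(
    "churn-pipeline-debug",
    import_date="2021-04-22",
    step_type="Training",
    churnMonths=3,
)
print(json.dumps(body, indent=2))
```

If a run submitted this way also shows 2021-04-23 in the notebook, the problem is likely in how the step binds its parameters rather than in the studio submission form.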

The Databricks notebook defines the following widgets:


from datetime import date

# dbutils is provided by the Databricks notebook runtime.

# Import date: falls back to today's date when no value is passed in.
today = str(date.today())
dbutils.widgets.text("ImportDateWidget", today, label="ImportDate")

startDate = "2019-01-01"
dbutils.widgets.text("startDateWidget", startDate, label="startDate")

churnMonths = 3
dbutils.widgets.text("churnMonthsWidget", str(churnMonths), label="churnMonths")

step_type = "Training"
dbutils.widgets.text("StepTypeWidget", step_type, label="StepType")

# Read the widget values (the original read several of them twice).
import_date = dbutils.widgets.get("ImportDateWidget")
start_date = dbutils.widgets.get("startDateWidget")
N_MONTHS_churn = int(dbutils.widgets.get("churnMonthsWidget"))
step_type = dbutils.widgets.get("StepTypeWidget")
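
To see which values actually reach the notebook, it may help to collect all widget values in one place and print them at the start of the run. This is only a minimal sketch: `read_widgets` is a hypothetical helper, and the getter is passed in as an argument so the function can be exercised outside Databricks. Inside the notebook you would pass `dbutils.widgets.get` instead of the stand-in used here.

```python
WIDGET_NAMES = [
    "ImportDateWidget",
    "startDateWidget",
    "churnMonthsWidget",
    "StepTypeWidget",
]


def read_widgets(get_widget, names=WIDGET_NAMES):
    """Return {widget name: value} using the supplied getter function."""
    return {name: get_widget(name) for name in names}


# Stand-in getter for local testing; in Databricks use dbutils.widgets.get.
fake_values = {
    "ImportDateWidget": "2021-04-22",
    "startDateWidget": "2019-01-01",
    "churnMonthsWidget": "3",
    "StepTypeWidget": "Training",
}
received = read_widgets(fake_values.__getitem__)

# Print every value so the job output shows what the notebook was given.
for name, value in received.items():
    print(f"{name} = {value!r}")
```

Comparing this printout with the values entered at submission time shows whether the default or the submitted value arrived at the notebook.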
