How to combine pipelines and hyperparameter tuning (HyperDrive) in a training step with the AzureML SDK

Problem description

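Short version: I'm trying to figure out how to run hyperparameter tuning inside the training step of a pipeline (i.e. train_step = PythonScriptStep(...)), and I'm not sure where I should put "config=hyperdrive".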

Long version:

General:

from azureml.core import Environment
from azureml.core.runconfig import RunConfiguration

# Register the environment
diabetes_env.register(workspace=ws)
registered_env = Environment.get(ws, 'diabetes-pipeline-env')

# Create a new runconfig object for the pipeline
run_config = RunConfiguration()

# Use the compute you created above.
run_config.target = ComputerTarget_Crea

# Assign the environment to the run configuration
run_config.environment = registered_env

Hyperparameters:

from azureml.core import ScriptRunConfig
from azureml.train.hyperdrive import GridParameterSampling, HyperDriveConfig, PrimaryMetricGoal, choice

# Add non-hyperparameter arguments - in this case, the training dataset
script_config = ScriptRunConfig(source_directory=experiment_folder,
                                script='diabetes_training.py',
                                arguments=['--input-data', diabetes_ds.as_named_input('training_data')],
                                environment=sklearn_env,
                                compute_target=training_cluster)

# Sample a range of parameter values
params = GridParameterSampling(
    {
        # Hyperdrive will try 6 combinations, adding these as script arguments
        '--learning_rate': choice(0.01, 0.1, 1.0),
        '--n_estimators': choice(10, 100)
    }
)

# Configure hyperdrive settings
hyperdrive = HyperDriveConfig(run_config=script_config,
                              hyperparameter_sampling=params,
                              policy=None,  # No early stopping policy
                              primary_metric_name='AUC',  # Find the highest AUC metric
                              primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                              max_total_runs=6,  # Restrict the experiment to 6 iterations
                              max_concurrent_runs=2)  # Run up to 2 iterations in parallel

# Run the experiment if I only want to run hyperdrive alone, without the pipeline
#experiment = Experiment(workspace=ws, name='mslearn-diabetes-hyperdrive')
#run = experiment.submit(config=hyperdrive)
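
The comment "Hyperdrive will try 6 combinations" and the setting max_total_runs=6 follow from grid sampling: every value of one parameter is paired with every value of the other, so 3 learning rates times 2 estimator counts gives 6 runs. The following is a plain-Python illustration of that expansion only. It does not call Azure ML, and the names search_space and expand_grid are hypothetical, introduced just for this sketch.

```python
from itertools import product

# Same search space as the GridParameterSampling definition above,
# written as plain lists instead of choice(...) expressions.
search_space = {
    "--learning_rate": [0.01, 0.1, 1.0],
    "--n_estimators": [10, 100],
}


def expand_grid(space):
    """Return one dict of script arguments per point of the grid."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


combinations = expand_grid(search_space)
for combo in combinations:
    print(combo)
print(len(combinations), "combinations")
```

If a third choice(...) parameter were added to the grid, the count would multiply again, so max_total_runs would need to grow with it to cover the whole grid.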

Pipeline:

from azureml.core import Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

# Step 1, run the data prep script
prep_step = PythonScriptStep(name="Prepare Data",
                             source_directory=experiment_folder,
                             script_name="prep_diabetes.py",
                             arguments=['--input-data', diabetes_ds.as_named_input('raw_data'),
                                        '--prepped-data', prepped_data_folder],
                             outputs=[prepped_data_folder],
                             compute_target=ComputerTarget_Crea,
                             runconfig=run_config,
                             allow_reuse=True)

# Step 2, run the training script
train_step = PythonScriptStep(name="Train and Register Model",
                              script_name="train_diabetes.py",
                              arguments=['--training-folder', prepped_data_folder],
                              inputs=[prepped_data_folder],
                              allow_reuse=True)

# Construct the pipeline
pipeline_steps = [prep_step,train_step]
pipeline = Pipeline(workspace=ws,steps=pipeline_steps)
print("Pipeline is built.")

# Create an experiment and run the pipeline
# How do I need to change the lines below to use hyperdrive?
experiment = Experiment(workspace=ws, name='mslearn-diabetes-pipeline')
pipeline_run = experiment.submit(pipeline, regenerate_outputs=True)

I'm not sure where in the Pipeline section I need to put config=hyperdrive.

Solution
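
One possible approach, offered as a sketch and not as a confirmed answer: the AzureML SDK (v1) has a HyperDriveStep class in azureml.pipeline.steps that accepts a HyperDriveConfig, so the hyperdrive object can take the place of the PythonScriptStep used for training. The pipeline is then submitted as before, and config=hyperdrive is never passed to experiment.submit. The function name build_hyperdrive_pipeline and the step name "Tune and Train Model" are hypothetical. The sketch also assumes that the ScriptRunConfig inside the HyperDriveConfig points at the training script and already passes '--training-folder', prepped_data_folder in its arguments. It has not been run against a live workspace.

```python
def build_hyperdrive_pipeline(ws, hyperdrive_config, prep_step, prepped_data_folder):
    """Build a two-step pipeline whose training step is a HyperDrive sweep.

    ws                  -- the Workspace from the question
    hyperdrive_config   -- the HyperDriveConfig named `hyperdrive` in the question
    prep_step           -- the existing "Prepare Data" PythonScriptStep
    prepped_data_folder -- the intermediate data object produced by prep_step
    """
    # Imported inside the function so the sketch can be loaded
    # even where the azureml SDK is not installed.
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import HyperDriveStep

    # The sweep replaces train_step; listing prepped_data_folder as an
    # input makes this step wait for the output of prep_step.
    hd_step = HyperDriveStep(
        name="Tune and Train Model",  # hypothetical step name
        hyperdrive_config=hyperdrive_config,
        inputs=[prepped_data_folder],
        allow_reuse=True,
    )
    return Pipeline(workspace=ws, steps=[prep_step, hd_step])


# Usage with the objects from the question (requires a real workspace):
# pipeline = build_hyperdrive_pipeline(ws, hyperdrive, prep_step, prepped_data_folder)
# experiment = Experiment(workspace=ws, name='mslearn-diabetes-pipeline')
# pipeline_run = experiment.submit(pipeline, regenerate_outputs=True)
```

Under these assumptions the last two lines of the question stay unchanged. Only the list of steps given to Pipeline differs.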
