Problem description
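I am using AMLS to train a model. I have a training pipeline in which step 1 trains the model and then saves the output to the temporary datastore location model_folder using:
os.makedirs(output_folder, exist_ok=True)
output_path = output_folder + "/model.pkl"
joblib.dump(value=model, filename=output_path)
model_folder is defined as:
model_folder = PipelineData("model_folder", datastore=ws.get_default_datastore())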
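However, step 1 fails when it tries to save the model, with the following ServiceError:
Failed to upload outputs due to Exception: Microsoft.RelInfra.Common.Exceptions.OperationFailedException: Cannot upload output xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. ---> Microsoft.WindowsAzure.Storage.StorageException: This request is not authorized to perform this operation using this permission.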
I interact with the default datastore as follows:
default_ds = ws.get_default_datastore()
default_ds.upload_files(...)
My 70_driver_log.txt is as follows:
[2020-08-25T04:03:27.315114] Entering context manager injector.
[context_manager_injector.py] Command line Options: Namespace(inject=['ProjectPythonPath:context_managers.ProjectPythonPath','RunHistory:context_managers.RunHistory','TrackUserError:context_managers.TrackUserError'],invocation=['train_word2vec.py','--output_folder','/mnt/batch/tasks/shared/LS_root/jobs/aiworkspace/azureml/xxxxx/mounts/workspaceblobstore/azureml/xxxxx/model_folder','--model_type','WO','--training_field','task_title','--regex','1','--stopword_removal','--tokenize_basic','0','--remove_punctuation','--autocorrect','--lemmatization','--word_vector_length','152','--model_learning_rate','0.025','--model_min_count','--model_window','7','--num_epochs','10'])
Starting the daemon thread to refresh tokens in background for process with pid = 113
Entering Run History Context Manager.
Current directory: /mnt/batch/tasks/shared/LS_root/jobs/aiworkspace/azureml/xxxxx/mounts/workspaceblobstore/azureml/xxxxx
Preparing to call script [ train_word2vec.py ] with arguments: ['--output_folder','10']
After variable expansion,calling script [ train_word2vec.py ] with arguments: ['--output_folder','10']
Script type = None
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Unzipping corpora/wordnet.zip.
OUTPUT FOLDER: /mnt/batch/tasks/shared/LS_root/jobs/aiworkspace/azureml/xxxxx/mounts/workspaceblobstore/azureml/xxxxx/model_folder
Loading sql data...
Loading abbreviation data...
/azureml-envs/azureml_xxxxx/lib/python3.6/site-packages/pandas/core/indexing.py:1783: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self.obj[item_labels[indexer[info_axis]]] = value
Pre-processing data...
Succesfully pre-processed the the text data
Training Word2Vec model...
Saving the model...
Starting the daemon thread to refresh tokens in background for process with pid = 113
The experiment completed successfully. Finalizing run...
[2020-08-25T04:03:52.293994] TimeoutHandler __init__
[2020-08-25T04:03:52.294149] TimeoutHandler __enter__
Cleaning up all outstanding Run operations,waiting 300.0 seconds
2 items cleaning up...
Cleanup took 0.44109439849853516 seconds
[2020-08-25T04:03:52.818991] TimeoutHandler __exit__
2020/08/25 04:04:00 logger.go:293: Process Exiting with Code: 0
My argparse arguments include:
parser.add_argument('--output_folder',type=str,dest='output_folder',default="output_folder",help='output folder')
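A minimal sketch of how that argument behaves, using only the standard library. The path passed to parse_args below is a made-up placeholder, not the real mount path; in the pipeline the value is supplied on the command line, as the invocation in the driver log shows.

```python
import argparse


def build_parser():
    """Build a parser with the same --output_folder argument as in the question."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_folder', type=str, dest='output_folder',
                        default='output_folder', help='output folder')
    return parser


# Explicit value (placeholder path, for illustration only).
args = build_parser().parse_args(['--output_folder', '/tmp/model_folder'])
print(args.output_folder)

# No value given: argparse falls back to the default string.
args_default = build_parser().parse_args([])
print(args_default.output_folder)
```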
Solution
Some thoughts:
- As @drum suggested, this is a permissions error.
- Could there be a typo in your output path? Does the same error occur if you use os.path.join(output_folder, 'model.pkl')?
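The os.path.join suggestion can be sketched as follows. This is only an illustration: pickle from the standard library stands in for joblib so the snippet runs without extra packages, and save_model, the dictionary used as a "model", and the temporary directory are hypothetical names, not taken from the question.

```python
import os
import pickle
import tempfile


def save_model(model, output_folder):
    """Write model to <output_folder>/model.pkl and return the full path."""
    os.makedirs(output_folder, exist_ok=True)
    # os.path.join inserts the separator, so no manual "/" is needed.
    output_path = os.path.join(output_folder, 'model.pkl')
    with open(output_path, 'wb') as f:
        # Stand-in for joblib.dump(value=model, filename=output_path).
        pickle.dump(model, f)
    return output_path


# Hypothetical usage with a throwaway directory and a dummy "model".
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, 'model_folder')
    path = save_model({'weights': [1, 2, 3]}, folder)
    saved_ok = os.path.isfile(path)
    with open(path, 'rb') as f:
        restored = pickle.load(f)
```

On a Linux compute target the two ways of building the path normally give the same result, so if the error persists after this change, the path construction is unlikely to be the cause.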
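I resolved this by adding my AMLS workspace to the "Storage Blob Data Contributor" role on the AMLS default storage account. It appears this role is normally added by default, but in my case that did not happen.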
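For anyone who would rather script that role assignment than use the portal, here is a rough sketch that only builds the argument list for the Azure CLI command az role assignment create. The function name and every ID below are placeholders, and which principal to use (for example the workspace's identity) is an assumption to check against your own setup. The snippet itself does not call Azure.

```python
def build_role_assignment_command(principal_id, subscription_id,
                                  resource_group, storage_account):
    """Return the az CLI arguments that grant Storage Blob Data Contributor."""
    # The scope is the resource ID of the storage account.
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_id,
        "--role", "Storage Blob Data Contributor",
        "--scope", scope,
    ]


# Placeholder values only; substitute your own before running the command.
cmd = build_role_assignment_command(
    principal_id="<principal-object-id>",
    subscription_id="<subscription-id>",
    resource_group="<resource-group>",
    storage_account="<storage-account-name>",
)
print(cmd)
```

To actually run it, the list can be passed to subprocess.run(cmd, check=True) on a machine where the Azure CLI is installed and logged in with sufficient rights to assign roles.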