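Problem description
Can anyone tell me which roles I should assign to my service account to fix this? I am essentially trying to launch the tensorflow_cloud tutorial from my laptop rather than through Colab. Partway through, I get:
googleapiclient.errors.HttpError: HttpError 403 when requesting https://ml.googleapis.com/v1/projects//jobs?alt=json returned "Access to project denied."
Retrying has not helped.
TF_cloud appears to build the Docker image successfully, but I cannot tell whether it then gets stuck pushing the image to build/storage, retrieving it, starting a Compute Engine instance, or somewhere else. Either way, it seems especially odd that what is denied is access to the project rather than to some other resource. So I suspect some service account needs the right role, but I do not know which role that is. The main service account has the following roles enabled: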
- roles/cloudbuild.builds.approver
- roles/cloudbuild.builds.editor
- roles/cloudbuild.builds.viewer
- roles/storage.admin
- roles/storage.objectAdmin
TF_Cloud appears to have created several other service accounts and assigned them the following roles:
- roles/cloudbuild.builds.builder
- roles/cloudbuild.serviceAgent
- roles/cloudbuild.workerPoolUser
- roles/cloudfunctions.developer
- roles/compute.instanceAdmin.v1
- roles/compute.serviceAgent
- roles/containerregistry.ServiceAgent
- roles/editor
- roles/iam.serviceAccountUser
- roles/ml.serviceAgent
- roles/pubsub.serviceAgent
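Since the failing request goes to ml.googleapis.com, one thing that may be worth checking is whether the main service account holds any role scoped to that service (for example roles/ml.developer or roles/ml.admin). The sketch below only does string matching on the role names listed above. It does not query IAM, the helper name roles_for_service is hypothetical, and the idea that a missing roles/ml.* role is the cause is an assumption, not a confirmed diagnosis. Broad roles such as roles/editor also grant permissions across services, so a prefix match is only a rough heuristic.

```python
# Roles copied from the two lists in the question.
MAIN_ACCOUNT_ROLES = [
    "roles/cloudbuild.builds.approver",
    "roles/cloudbuild.builds.editor",
    "roles/cloudbuild.builds.viewer",
    "roles/storage.admin",
    "roles/storage.objectAdmin",
]
AUTO_CREATED_ACCOUNT_ROLES = [
    "roles/cloudbuild.builds.builder",
    "roles/cloudbuild.serviceAgent",
    "roles/cloudbuild.workerPoolUser",
    "roles/cloudfunctions.developer",
    "roles/compute.instanceAdmin.v1",
    "roles/compute.serviceAgent",
    "roles/containerregistry.ServiceAgent",
    "roles/editor",
    "roles/iam.serviceAccountUser",
    "roles/ml.serviceAgent",
    "roles/pubsub.serviceAgent",
]


def roles_for_service(roles, service):
    """Return the roles whose name is scoped to `service`, e.g. 'ml'.

    Hypothetical helper: plain prefix matching on role names only.
    """
    prefix = "roles/" + service + "."
    return [role for role in roles if role.startswith(prefix)]


# The main account has Cloud Build and Storage roles but nothing under roles/ml.
print("main account, ml:", roles_for_service(MAIN_ACCOUNT_ROLES, "ml"))
print("auto-created accounts, ml:", roles_for_service(AUTO_CREATED_ACCOUNT_ROLES, "ml"))
```

With the lists as posted, the only role under roles/ml. belongs to the auto-created accounts, not to the main service account that submits the job.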
In case it helps, here is some of the code and the output. It consists of the typical hello_world MNIST model.
import tensorflow as tf
import tensorflow_cloud as tfc

# Flatten each 28x28 image into a 784-vector and scale pixels to [0, 1]
x_train = x_train.reshape((60000, 28 * 28))
x_train = x_train.astype('float32') / 255
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(28 * 28,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy'])
Then some tensorflow_cloud-specific code:
if tfc.remote():
    # Configure TensorBoard logs, checkpointing, and early stopping
    callbacks = [
        tf.keras.callbacks.TensorBoard(log_dir=TENSORBOARD_LOGS_DIR),
        tf.keras.callbacks.ModelCheckpoint(MODEL_CHECKPOINT_DIR, save_best_only=True),
        tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=0.001, patience=3),
    ]
    model.fit(x=x_train, y=y_train, epochs=100, validation_split=0.2, callbacks=callbacks)
    model.save(SAVED_MODEL_DIR)

with open('requirements.txt', 'w') as f:
    f.write('tensorflow-cloud\n')

# Optional: Some recommended base images. If you provide none the system
# will choose one for you.
TF_GPU_IMAGE = "gcr.io/deeplearning-platform-release/tf2-gpu.2-5"

# Submit a single node training job using GPU.
tfc.run(
    distribution_strategy='auto',
    requirements_txt='requirements.txt',
    docker_config=tfc.DockerConfig(
        parent_image=TF_GPU_IMAGE,
        image_build_bucket=GCS_BUCKET,
    ),
    chief_config=tfc.COMMON_MACHINE_CONFIGS['K80_1X'],
    job_labels={'job': JOB_NAME},
)
And the output:
3/3 [==============================] - 1s 82ms/step - loss: 2.2664 - accuracy: 0.2375 - val_loss: 1.6802 - val_accuracy: 0.6500
Validating environment and input parameters.
Validation was successful.
Building and pushing the Docker image. This may take a few minutes.
INFO:tensorflow_cloud.core.containerize:Uploading files to GCS.
INFO:tensorflow_cloud.core.containerize:Building and publishing Docker image using Google Cloud Build: gcr.io/tf-cloud-319802/tf_cloud_train:cb8729ce_5cac_42c3_a1ff_6e915c501bda
Submitting Docker build and push request to Cloud Build.
Please access your Cloud Build job information here:
https://console.cloud.google.com/cloud-build/builds
INFO:absl:Detected running in DL_CONTAINER environment.
INFO:absl:Detected running in DL_CONTAINER environment.
Waiting for Cloud Build, checking status in 30 seconds.
INFO:absl:Detected running in DL_CONTAINER environment.
Waiting for Cloud Build, checking status in 30 seconds.
[It keeps doing this for a while and then...]
INFO:absl:Detected running in DL_CONTAINER environment.
WARNING:googleapiclient.http:Invalid JSON content from response: b'{\n "error": {\n "code": 403,\n "message": "Access to project denied. This might be a transient error and a retry may succeed. If the error persists, please check the IAM permissions on your project.",\n "status": "PERMISSION_DENIED"\n }\n}\n'
Traceback (most recent call last):
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/working_tutorial.py", line 80, in <module>
    job_labels={'job': JOB_NAME}
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/tensorflow_cloud/core/run.py", line 327, in run
    service_account=service_account,
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/tensorflow_cloud/core/deploy.py", line 98, in deploy_job
    raise err
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/tensorflow_cloud/core/deploy.py", line 90, in deploy_job
    .create(parent="projects/{}".format(project_id), body=request_dict)
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/googleapiclient/http.py", line 915, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://ml.googleapis.com/v1/projects/tf-cloud-319802/jobs?alt=json returned "Access to project denied. This might be a transient error and a retry may succeed. If the error persists, please check the IAM permissions on your project.". Details: "Access to project denied. This might be a transient error and a retry may succeed. If the error persists, please check the IAM permissions on your project.">
There was an error submitting the job.
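To narrow down which stage failed, it can help to pull the structured fields out of the response body and look at which API host rejected the request. The following is a minimal sketch using only the standard library. The body and URL are copied from the log above, and summarize_error is a hypothetical helper name, not part of googleapiclient or tensorflow_cloud.

```python
import json
from urllib.parse import urlparse

# Response body and request URL copied from the log above.
RAW_BODY = (
    b'{\n "error": {\n "code": 403,\n "message": "Access to project denied. '
    b'This might be a transient error and a retry may succeed. If the error '
    b'persists, please check the IAM permissions on your project.",\n '
    b'"status": "PERMISSION_DENIED"\n }\n}\n'
)
REQUEST_URL = "https://ml.googleapis.com/v1/projects/tf-cloud-319802/jobs?alt=json"


def summarize_error(raw_body, url):
    """Extract the API host and the main error fields from a failed request.

    Hypothetical helper: assumes the body is a JSON object with an "error"
    member carrying "code", "status", and "message", as in the log above.
    """
    error = json.loads(raw_body.decode("utf-8"))["error"]
    return {
        "service": urlparse(url).netloc,
        "code": error["code"],
        "status": error["status"],
        "message": error["message"],
    }


summary = summarize_error(RAW_BODY, REQUEST_URL)
print(summary["service"], summary["code"], summary["status"])
# prints: ml.googleapis.com 403 PERMISSION_DENIED
```

The host is ml.googleapis.com and the traceback ends in deploy_job, which suggests the rejection happens when the training job is submitted rather than during the Cloud Build image build or push.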