Access to project denied in TensorFlow Cloud

Problem description

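Can anyone tell me which roles I should assign to my service account to resolve this? I am essentially trying to run the tensorflow_cloud tutorial from my laptop rather than through Colab. During the run, I get: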

googleapiclient.errors.HttpError: error when requesting https://ml.googleapis.com/v1/projects//jobs?alt=json. There was an error submitting the job.

Retrying has not helped.

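TF_Cloud seems to build the Docker image successfully, but I can't tell whether it gets stuck pushing the image to build/storage, trying to retrieve it, starting Compute Engine, or somewhere else. In any case, it seems especially odd that what is denied is access to the project rather than to some other resource. So I'm guessing that some service account needs an appropriate role, but I don't know which role that is. The main service account has the following roles enabled: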

  • roles/cloudbuild.builds.approver
  • roles/cloudbuild.builds.editor
  • roles/cloudbuild.builds.viewer
  • roles/storage.admin
  • roles/storage.objectAdmin

TF_Cloud also appears to have created several other service accounts and assigned them the following roles:

  • roles/cloudbuild.builds.builder
  • roles/cloudbuild.serviceAgent
  • roles/cloudbuild.workerPoolUser
  • roles/cloudfunctions.developer
  • roles/compute.instanceAdmin.v1
  • roles/compute.serviceAgent
  • roles/containerregistry.ServiceAgent
  • roles/editor
  • roles/iam.serviceAccountUser
  • roles/ml.serviceAgent
  • roles/pubsub.serviceAgent
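
One way to sanity-check the lists above is to group the main account's roles by service and see whether any of them belongs to AI Platform (the ml.* family), since the failing request goes to ml.googleapis.com. The sketch below does only that, on the role names copied from the question. The set ASSUMED_JOB_SUBMIT_ROLES is an assumption that is not confirmed anywhere in this post: it guesses that a role such as roles/ml.developer or roles/ml.admin is what allows job submission, and it ignores broad roles like roles/editor. The helper names are hypothetical.

```python
# Roles listed in the question for the main service account.
main_account_roles = {
    "roles/cloudbuild.builds.approver",
    "roles/cloudbuild.builds.editor",
    "roles/cloudbuild.builds.viewer",
    "roles/storage.admin",
    "roles/storage.objectAdmin",
}

# ASSUMPTION (not from the post): roles that would allow submitting a
# training job. Check the current IAM documentation before relying on this.
ASSUMED_JOB_SUBMIT_ROLES = {"roles/ml.developer", "roles/ml.admin"}


def roles_by_service(roles):
    """Group role names by the service prefix that follows 'roles/'."""
    grouped = {}
    for role in sorted(roles):
        service = role.split("/", 1)[1].split(".", 1)[0]
        grouped.setdefault(service, []).append(role)
    return grouped


def can_submit_jobs(roles, required=ASSUMED_JOB_SUBMIT_ROLES):
    """Return True if at least one of the assumed roles is granted."""
    return bool(set(roles) & set(required))


grouped = roles_by_service(main_account_roles)
print("services covered:", sorted(grouped))
print("has an ml.* role:", "ml" in grouped)
print("can submit jobs (under the assumption):",
      can_submit_jobs(main_account_roles))
```

If the assumption holds, the main account has Cloud Build and Cloud Storage roles but nothing from the ml.* family, which would be consistent with the build succeeding and the job submission being rejected.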

In case it helps, here is some of the code and its output. It consists of the typical hello-world MNIST model.

x_train = x_train.reshape((60000, 28 * 28))
x_train = x_train.astype('float32') / 255

model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(28 * 28,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(loss='sparse_categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

Then comes the tensorflow_cloud-specific code:

if tfc.remote():
    # Configure TensorBoard logs, checkpointing, and early stopping.
    callbacks = [
        tf.keras.callbacks.TensorBoard(log_dir=TENSORBOARD_LOGS_DIR),
        tf.keras.callbacks.ModelCheckpoint(
            MODEL_CHECKPOINT_DIR, save_best_only=True),
        tf.keras.callbacks.EarlyStopping(
            monitor='loss', min_delta=0.001, patience=3),
    ]

    model.fit(x=x_train, y=y_train, epochs=100, validation_split=0.2,
              callbacks=callbacks)

    model.save(SAVED_MODEL_DIR)

with open('requirements.txt', 'w') as f:
    f.write('tensorflow-cloud\n')

# Optional: Some recommended base images. If you provide none the system
# will choose one for you.
TF_GPU_IMAGE = "gcr.io/deeplearning-platform-release/tf2-gpu.2-5"

# Submit a single node training job using GPU.
tfc.run(
    distribution_strategy='auto',
    requirements_txt='requirements.txt',
    docker_config=tfc.DockerConfig(
        parent_image=TF_GPU_IMAGE,
        image_build_bucket=GCS_BUCKET,
    ),
    chief_config=tfc.COMMON_MACHINE_CONFIGS['K80_1X'],
    job_labels={'job': JOB_NAME}
)

Output

3/3 [==============================] - 1s 82ms/step - loss: 2.2664 - accuracy: 0.2375 - val_loss: 1.6802 - val_accuracy: 0.6500
Validating environment and input parameters.
Validation was successful.
Building and pushing the Docker image. This may take a few minutes.
INFO:tensorflow_cloud.core.containerize:Uploading files to GCS.
INFO:tensorflow_cloud.core.containerize:Building and publishing Docker image using Google Cloud Build: gcr.io/tf-cloud-319802/tf_cloud_train:cb8729ce_5cac_42c3_a1ff_6e915c501bda
Submitting Docker build and push request to Cloud Build.
Please access your Cloud Build job information here:
https://console.cloud.google.com/cloud-build/builds
INFO:absl:Detected running in DL_CONTAINER environment.
INFO:absl:Detected running in DL_CONTAINER environment.
Waiting for Cloud Build, checking status in 30 seconds.
INFO:absl:Detected running in DL_CONTAINER environment.
Waiting for Cloud Build, checking status in 30 seconds.
[It keeps doing this for a while and then...]
INFO:absl:Detected running in DL_CONTAINER environment.
WARNING:googleapiclient.http:Invalid JSON content from response: b'{\n  "error": {\n    "code": 403,\n    "message": "Access to project denied. This might be a transient error and a retry may succeed. If the error persists, please check the IAM permissions on your project.",\n    "status": "PERMISSION_DENIED"\n  }\n}\n'
Traceback (most recent call last):
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/working_tutorial.py", line 80, in <module>
    job_labels={'job': JOB_NAME}
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/tensorflow_cloud/core/run.py", line 327, in run
    service_account=service_account,
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/tensorflow_cloud/core/deploy.py", line 98, in deploy_job
    raise err
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/tensorflow_cloud/core/deploy.py", line 90, in deploy_job
    .create(parent="projects/{}".format(project_id), body=request_dict)
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Users/chrisdrymon/PycharmProjects/tf_cloud_test/venv/lib/python3.7/site-packages/googleapiclient/http.py", line 915, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://ml.googleapis.com/v1/projects/tf-cloud-319802/jobs?alt=json returned "Access to project denied. This might be a transient error and a retry may succeed. If the error persists, please check the IAM permissions on your project.". Details: "Access to project denied. This might be a transient error and a retry may succeed. If the error persists, please check the IAM permissions on your project.">
There was an error submitting the job.
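
The response body in the warning line of the log is a small JSON document, so its fields can be pulled out with the standard library alone. The sketch below is only an illustration of reading that body: the bytes are transcribed from the log, and summarize_error is a hypothetical helper, not part of googleapiclient or tensorflow_cloud.

```python
import json

# Response body as reported in the warning line of the log.
body = (
    b'{\n  "error": {\n    "code": 403,\n'
    b'    "message": "Access to project denied. This might be a transient '
    b'error and a retry may succeed. If the error persists, please check '
    b'the IAM permissions on your project.",\n'
    b'    "status": "PERMISSION_DENIED"\n  }\n}\n'
)


def summarize_error(raw):
    """Return (code, status, message) from a JSON error body."""
    error = json.loads(raw).get("error", {})
    return error.get("code"), error.get("status"), error.get("message")


code, status, message = summarize_error(body)
print(code, status)
print(message)
```

The status field (PERMISSION_DENIED) together with the request URL under ml.googleapis.com suggests the rejection happens at job submission, after the Cloud Build step has finished, rather than during the image build.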
