Can I mount a volume in a Katib Experiment?

Problem description

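I am using the .yaml file below to create a Katib experiment in Kubeflow. However, I get the following error:

Failed to reconcile: cannot restore struct from: string

Is there a way to fix this? Most of the Katib experiment examples do not use volumes, but I want to download data from S3 and then mount the volume that holds it in the training container.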

apiVersion: "kubeflow.org/v1alpha3"
kind: Experiment
metadata:
  namespace: apple
  labels:
    controller-tools.k8s.io: "1.0"
  name: transformer-experiment
spec:
  objective:
    type: maximize
    goal: 0.8
    objectiveMetricName: Train-accuracy
    additionalMetricNames:
      - Train-loss
  algorithm:
    algorithmName: random
  parallelTrialCount: 3
  maxTrialCount: 12
  maxFailedTrialCount: 3
  metricsCollectorSpec:
    collector:
      kind: StdOut
  parameters:
    - name: --lr
      parameterType: double
      feasibleSpace:
        min: "0.01"
        max: "0.03"
    - name: --dropout_rate
      parameterType: double
      feasibleSpace:
        min: "0.005"
        max: "0.020"
    - name: --layer_count
      parameterType: int
      feasibleSpace:
        min: "2"
        max: "5"
    - name: --d_model_count
      parameterType: categorical
      feasibleSpace:
        list:
        - "64"
        - "128"
        - "256"
  trialTemplate:
    goTemplate:
        rawTemplate: |-
          apiVersion: batch/v1
          kind: Job
          metadata:
            name: {{.Trial}}
            namespace: {{.NameSpace}}
          spec:
            template:
              spec:
                volumes:
                - name: train-data
                  emptyDir: {}
                containers:
                - name: data-download
                  image: amazon/aws-cli
                  command:
                  - "aws s3 sync s3://kubeflow/kubeflowdata.tar.gz /train-data"
                  volumeMounts:
                  - name: train-data
                    mountPath: /train-data
                - name: {{.Trial}}
                  image: <Our Image>
                  command:
                  - "cd /train-data"
                  - "ls"
                  - "python"
                  - "/opt/ml/src/main.py"
                  - "--train_batch=64"
                  - "--test_batch=64"
                  - "--num_workers=4"
                  volumeMounts:
                  - name: train-data
                    mountPath: /train-data
                  {{- with .HyperParameters}}
                  {{- range .}}
                  - "{{.Name}}={{.Value}}"
                  {{- end}}
                  {{- end}}
                restartPolicy: Never
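
A likely cause, judging from the manifest itself: the `{{- range .}}` block that emits the hyperparameter strings sits after `volumeMounts:`, so the rendered Job has plain strings such as "--lr=0.01" inside a list that must contain volume mount objects. That would match "cannot restore struct from: string". Below is a minimal sketch of the trial template with the hyperparameter block moved directly under `command:`. The rest are assumptions, not something confirmed by the question: the download step is moved into an init container so it finishes before training starts, `aws s3 cp` replaces `sync` because the source is a single object, `args` is used on the assumption that the amazon/aws-cli image has `aws` as its entrypoint, and `workingDir` stands in for the `cd /train-data` entry. AWS credentials for the download container are not shown.

```yaml
  trialTemplate:
    goTemplate:
        rawTemplate: |-
          apiVersion: batch/v1
          kind: Job
          metadata:
            name: {{.Trial}}
            namespace: {{.NameSpace}}
          spec:
            template:
              spec:
                volumes:
                - name: train-data
                  emptyDir: {}
                # Assumption: run the download as an init container so the
                # data is in place before the training container starts.
                initContainers:
                - name: data-download
                  image: amazon/aws-cli
                  # Assumption: the image entrypoint is already `aws`.
                  args: ["s3", "cp", "s3://kubeflow/kubeflowdata.tar.gz", "/train-data/"]
                  volumeMounts:
                  - name: train-data
                    mountPath: /train-data
                containers:
                - name: {{.Trial}}
                  image: <Our Image>
                  # Replaces the "cd /train-data" entry from the original command list.
                  workingDir: /train-data
                  command:
                  - "python"
                  - "/opt/ml/src/main.py"
                  - "--train_batch=64"
                  - "--test_batch=64"
                  - "--num_workers=4"
                  # Hyperparameters must extend the command list,
                  # so they are rendered before volumeMounts.
                  {{- with .HyperParameters}}
                  {{- range .}}
                  - "{{.Name}}={{.Value}}"
                  {{- end}}
                  {{- end}}
                  volumeMounts:
                  - name: train-data
                    mountPath: /train-data
                restartPolicy: Never
```

If only the reconcile error needs fixing, moving the `{{- with .HyperParameters}}` block above `volumeMounts:` is the one change that matters. The other changes address what would probably fail next once the Job is created.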
