Argo Workflows pods missing CPU/memory resources

Problem description

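I'm running into a missing resources issue when submitting a Workflow. The Kubernetes namespace my-namespace has quotas enabled, and for whatever reason the pods created after submitting the workflow fail with: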

pods "hello" is forbidden: Failed quota: team: must specify limits.cpu,limits.memory,requests.cpu,requests.memory

I'm submitting the following Workflow:

apiVersion: "argoproj.io/v1alpha1"
kind: "Workflow"
metadata:
  name: "hello"
  namespace: "my-namespace"
spec:
  entrypoint: "main"
  templates:
  - name: "main"
    container:
      image: "docker/whalesay"
      resources:
        requests:
          memory: 0
          cpu: 0
        limits:
          memory: "128Mi"
          cpu: "250m"

Argo is running on Kubernetes 1.19.6 and was deployed with the official Helm chart, version 0.16.10. Here are my Helm values:

controller:
  workflowNamespaces:
  - "my-namespace"
  resources:
    requests:
      memory: 0
      cpu: 0
    limits:
      memory: 500Mi
      cpu: 0.5
  pdb:
    enabled: true
  # See https://argoproj.github.io/argo-workflows/workflow-executors/
  # docker container runtime is not present in the TKGI clusters
  containerRuntimeExecutor: "k8sapi"
workflow:
  namespace: "my-namespace"
  serviceAccount:
    create: true
  rbac:
    create: true
server:
  replicas: 2
  secure: false
  resources:
    requests:
      memory: 0
      cpu: 0
    limits:
      memory: 500Mi
      cpu: 0.5
  pdb:
    enabled: true
executer:
  resources:
    requests:
      memory: 0
      cpu: 0
    limits:
      memory: 500Mi
      cpu: 0.5

Any ideas on what I may be missing? Thanks, Weldon

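Update 1: I tried another namespace without quotas enabled and got past the missing resources issue. However, I now see: Failed to establish pod watch: timed out waiting for the condition. Here is what the spec looks like for this pod. You can see that the wait container is missing resources. This is the container causing the issue reported in this question.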

spec:
  containers:
  - command:
    - argoexec
    - wait
    env:
    - name: ARGO_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: ARGO_CONTAINER_RUNTIME_EXECUTOR
      value: k8sapi
    image: argoproj/argoexec:v2.12.5
    imagePullPolicy: IfNotPresent
    name: wait
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /argo/podmetadata
      name: podmetadata
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-v4jlb
      readOnly: true
  - image: docker/whalesay
    imagePullPolicy: Always
    name: main
    resources:
      limits:
        cpu: 250m
        memory: 128Mi
      requests:
        cpu: "0"
        memory: "0"

Solution

If you can, try deploying the workflow in another namespace and verify whether it works there.

If you can, try removing the quota from the respective namespace.
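
For reference, the error message implies a ResourceQuota named team that constrains all four values (limits.cpu, limits.memory, requests.cpu, requests.memory). A minimal sketch of what such an object might look like follows. The name and namespace come from the question, while the hard values are hypothetical placeholders, not the real ones:

```yaml
# Sketch of the quota implied by the error message.
# The four "hard" values below are assumed for illustration only.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team                  # name taken from "Failed quota: team"
  namespace: my-namespace
spec:
  hard:
    requests.cpu: "4"         # hypothetical
    requests.memory: 8Gi      # hypothetical
    limits.cpu: "8"           # hypothetical
    limits.memory: 16Gi       # hypothetical
```

While a quota like this is active, every container in a pod, including the ones Argo injects, has to declare all four values or the pod is rejected.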

Instead of a quota, you can also use a LimitRange:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limit-range
spec:
  limits:
  - default:
      memory: 512Mi
      cpu: 250m
    defaultRequest:
      cpu: 50m
      memory: 64Mi
    type: Container

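So any container that does not specify resource requests or limits gets these defaults: requests of 50m CPU and 64Mi memory, and limits of 250m CPU and 512Mi memory.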
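
As a sketch of the effect, assuming the LimitRange above is created in the same namespace before the workflow is submitted, the wait container from the question (which had resources: {}) would be admitted with the defaults filled in, roughly like this:

```yaml
# Fragment of the wait container after LimitRange defaulting.
# Only the resources section changes; values come from the LimitRange above.
- name: wait
  image: argoproj/argoexec:v2.12.5
  resources:
    limits:
      cpu: 250m        # from "default"
      memory: 512Mi    # from "default"
    requests:
      cpu: 50m         # from "defaultRequest"
      memory: 64Mi     # from "defaultRequest"
```

The main container keeps the values it already declares, since defaults only apply to fields that are left unset.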

https://kubernetes.io/docs/concepts/policy/limit-range/
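
Since the container without resources is the one Argo injects, another option may be to set them on the Argo side. The workflow controller ConfigMap has an executor section that customizes the init and wait containers. The sketch below assumes the default ConfigMap name and an argo install namespace, both of which may differ in a Helm release, and reuses the limits from the question's Helm values. Whether chart version 0.16.10 renders a Helm value into this section is something to check against the chart itself:

```yaml
# Sketch only: ConfigMap name and namespace are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap   # assumed default name
  namespace: argo                       # assumed install namespace
data:
  # Applies to the init and wait containers added to each workflow pod.
  # Older installs may nest the same keys under a single "config: |" entry.
  executor: |
    resources:
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        cpu: 500m
        memory: 500Mi
```

Note that the key in the controller configuration is spelled executor. If the ConfigMap is managed by Helm, edits made directly to it are likely to be overwritten on the next upgrade, so the change would belong in the chart values.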