AWS X-Ray DaemonSet pod error: Failed to start container "xray-daemon": Error response from daemon: OCI

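Problem description

I am trying to set up the AWS X-Ray DaemonSet in an EKS Kubernetes cluster.

The problem is that the "xray-daemon" pod fails to start and ends up in the "CrashLoopBackOff" state.

When I look at the logs of the daemonset pods, the following error is shown: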

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 20s default-scheduler Successfully assigned default/xray-daemon-wcnzt to ip-172-34-169-37.ap-south-1.compute.internal
Normal Pulling 15s (x2 over 19s) kubelet Pulling image "amazon/aws-xray-daemon:latest"
Normal Pulled 12s (x2 over 16s) kubelet Successfully pulled image "amazon/aws-xray-daemon:latest"
Normal Created 12s (x2 over 16s) kubelet Created container xray-daemon
Warning Failed 12s (x2 over 16s) kubelet Error: failed to start container "xray-daemon": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "/usr/bin/xray": stat /usr/bin/xray: no such file or directory: unknown

Steps to reproduce:

First, I created an IAM service account with the AWSXRayDaemonWriteAccess policy attached:
eksctl create iamserviceaccount \
    --name xray-daemon \
    --namespace default \
    --cluster eksdemo1 \
    --attach-policy-arn arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess \
    --approve \
    --override-existing-serviceaccounts

Then I tried to create the xray daemonset using this "xray-k8s-daemonset.yml" file:

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: xray-daemon
  name: xray-daemon
  namespace: default
  # Update IAM Role ARN created for X-Ray access
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::12345678999:role/eksctl-eksdemo1-addon-iamserviceaccount-defa-Role1-1FIM8S2K6D404
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: xray-daemon
  namespace: default
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: xray-daemon
  template:
    metadata:
      labels:
        app: xray-daemon
    spec:
      serviceAccountName: xray-daemon
      volumes:
        - name: config-volume
          configMap:
            name: "xray-config"
      containers:
        - name: xray-daemon
          image: amazon/aws-xray-daemon
          command: ["/usr/bin/xray","-c","/aws/xray/config.yaml"]
          resources:
            requests:
              cpu: 256m
              memory: 32Mi
            limits:
              cpu: 512m
              memory: 64Mi
          ports:
            - name: xray-ingest
              containerPort: 2000
              hostPort: 2000
              protocol: UDP
            - name: xray-tcp
              containerPort: 2000
              hostPort: 2000
              protocol: TCP
          volumeMounts:
            - name: config-volume
              mountPath: /aws/xray
              readOnly: true
---
# Configuration for AWS X-Ray daemon
apiVersion: v1
kind: ConfigMap
metadata:
  name: xray-config
  namespace: default
data:
  config.yaml: |-
    TotalBufferSizeMB: 24
    Socket:
      UDPAddress: "0.0.0.0:2000"
      TCPAddress: "0.0.0.0:2000"
    Version: 2
---
# k8s service definition for AWS X-Ray daemon headless service
apiVersion: v1
kind: Service
metadata:
  name: xray-service
  namespace: default
spec:
  selector:
    app: xray-daemon
  clusterIP: None
  ports:
    - name: xray-ingest
      port: 2000
      protocol: UDP
    - name: xray-tcp
      port: 2000
      protocol: TCP

The role ARN is correct, I am sure of it (I checked the ARN in the AWS console).

Solution

This issue was fixed by using the "amazon/aws-xray-daemon:3.2.0" Docker image.

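The pod template in "xray-k8s-daemonset.yml" no longer seems to work with the latest version of the "amazon/aws-xray-daemon" image.

Starting with image version 3.3.0, there were some changes that break the xray daemonset deployment defined in "xray-k8s-daemonset.yml".

So I replaced the image "amazon/aws-xray-daemon" with "amazon/aws-xray-daemon:3.2.0" in "xray-k8s-daemonset.yml".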
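For reference, the change comes down to pinning the image tag in the container spec of the DaemonSet. The excerpt below is a minimal sketch showing only the affected part of the manifest. Everything else in "xray-k8s-daemonset.yml" is assumed to stay exactly as posted in the question.

```yaml
# Excerpt of the DaemonSet pod template (spec.template.spec)
      containers:
        - name: xray-daemon
          # Pinned tag: an untagged image name resolves to :latest,
          # which is what the events in the question show being pulled.
          image: amazon/aws-xray-daemon:3.2.0
          # Unchanged from the question; this path exists in the 3.2.0 image,
          # according to the result reported here.
          command: ["/usr/bin/xray", "-c", "/aws/xray/config.yaml"]
```

After editing the file, re-applying it with kubectl apply -f xray-k8s-daemonset.yml should roll the pods over to the pinned image, since the DaemonSet uses the RollingUpdate strategy.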

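Version 3.2.0 works fine, and the problem is fixed.

P.S. It is still unclear how to make the xray daemonset work with the latest version of the "amazon/aws-xray-daemon" image.

So other solutions are still welcome.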