Unable to deploy Spark History Server on a Kubernetes pod

Problem description

I am trying to deploy the Spark History Server on a Kubernetes pod. To do so, I ran the following commands:

helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm install stable/spark-history-server --generate-name

However, the pod fails to start. Here are the events from the error log:

Events:
  Type     Reason       Age                      From                               Message
  ----     ------       ----                     ----                               -------
  Warning  FailedMount  7m51s (x129 over 3h31m)  kubelet,aks-agentpool-20240184-1  (combined from similar events): MountVolume.SetUp failed for volume "nfs-pv" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/2bc91c0b-a9e8-4af6-9a6a-8e4781079afb/volumes/kubernetes.io~nfs/nfs-pv --scope -- mount -t nfs spark-history-server-1599813147-nfs.default.svc.cluster.local:/ /var/lib/kubelet/pods/2bc91c0b-a9e8-4af6-9a6a-8e4781079afb/volumes/kubernetes.io~nfs/nfs-pv
Output: Running scope as unit run-re958022a7250453abcd26d58efcbf360.scope.
mount.nfs: Failed to resolve server spark-history-server-1599813147-nfs.default.svc.cluster.local: Name or service not known
  Warning  FailedMount  2m51s (x17 over 3h31m)  kubelet,aks-agentpool-20240184-1  Unable to attach or mount volumes: unmounted volumes=[data],unattached volumes=[spark-history-server-1599813147-token-bglz7 data]: timed out waiting for the condition

Any help would be greatly appreciated!

Solution

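Unfortunately, this is one of the known issues:

Kubernetes installs do not configure the nodes' resolv.conf files to use the cluster DNS by default, because that process is inherently distribution-specific. This should probably be implemented eventually.

In other words, the NFS mount is performed by the kubelet on the node, and the node's resolver does not know about cluster service names such as spark-history-server-1599813147-nfs.default.svc.cluster.local. There are a few workarounds you can choose from: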

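  • Specify the NFS service's ClusterIP instead of its domain name; the NFS volume then mounts successfully. You can find an example here

  • Manually update resolv.conf on every node so that it uses the cluster DNS.

  • Manually add the service name to /etc/hosts on all nodes.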
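
To make the first and third workarounds more concrete, here is a minimal sketch. The function names render_nfs_pv and render_hosts_entry are my own helpers, and the capacity (1Gi) and access mode (ReadWriteMany) are assumptions rather than values taken from the chart; only the PV name nfs-pv, the export path / and the service name come from the events above. It only prints a manifest and a hosts line, so nothing is changed until you apply the output yourself.

```shell
#!/bin/sh
# Sketch only: helper names, capacity and access mode are illustrative assumptions.

# Print a PersistentVolume manifest whose NFS server is an IP address,
# so the node never has to resolve a cluster DNS name.
# Arguments: PV name, NFS server IP (the service's ClusterIP), capacity.
render_nfs_pv() {
  pv_name="$1"
  server_ip="$2"
  capacity="$3"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${pv_name}
spec:
  capacity:
    storage: ${capacity}
  accessModes:
    - ReadWriteMany
  nfs:
    server: ${server_ip}
    path: /
EOF
}

# Print an /etc/hosts line that maps a service DNS name to an IP.
# Arguments: IP address, host name.
render_hosts_entry() {
  printf '%s %s\n' "$1" "$2"
}

# Example against a live cluster (not executed here):
#   ip=$(kubectl get svc spark-history-server-1599813147-nfs -n default \
#        -o jsonpath='{.spec.clusterIP}')
#   render_nfs_pv nfs-pv "$ip" 1Gi
#   render_hosts_entry "$ip" spark-history-server-1599813147-nfs.default.svc.cluster.local
```

Two caveats. As far as I know, the volume source of an existing PersistentVolume cannot be edited in place, so the nfs-pv created by the chart would probably have to be deleted and recreated with the IP. Also, a ClusterIP changes if the service is deleted and recreated, so both the manifest and any /etc/hosts entry would need updating after a reinstall.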