Problem description
In my k8s cluster, filebeat fails to connect after a node is restarted. The other k8s nodes work normally.
Logs from the filebeat pod:
remote: + local LOCK_TYPE=exclusive
remote: + local APP_DEPLOY_LOCK_FILE=/home/dokku/example/.deploy.lock
remote: + local 'LOCK_WAITING_MSG=example currently has a deploy lock in place. Waiting...'
remote: + local 'LOCK_Failed_MSG=example currently has a deploy lock in place. Exiting...'
remote: + acquire_advisory_lock /home/dokku/example/.deploy.lock exclusive 'dash-stroom-app-v1 currently has a deploy lock in place. Waiting...' 'example currently has a deploy lock in place. Exiting...'
remote: + declare 'desc=acquire advisory lock'
remote: + local LOCK_FILE=/home/dokku/example/.deploy.lock LOCK_TYPE=exclusive 'LOCK_WAITING_MSG=example currently has a deploy lock in place. Waiting...' 'LOCK_Failed_MSG=example currently has a deploy lock in place. Exiting...'
remote: + local LOCK_FD=200
remote: + local SHOW_MSG=true
remote: + eval 'exec 200>/home/dokku/example/.deploy.lock'
remote: ++ exec
remote: + [[ exclusive == \w\a\i\t\i\n\g ]]
remote: + flock -n 200
remote: + dokku_log_fail 'example currently has a deploy lock in place. Exiting...'
remote: + declare 'desc=log fail formatter'
remote: + echo 'example currently has a deploy lock in place. Exiting...'
remote: example currently has a deploy lock in place. Exiting...
remote: + exit 1
remote: + exit_code=1
remote: + set -e
remote: + [[ 1 -eq 10 ]]
remote: + implemented=1
remote: + [[ 1 -ne 0 ]]
remote: + exit 1
To https://example.com/GIT/example
! [remote rejected] master -> master (pre-receive hook declined)
error: Failed to push some refs to 'https://example.com/GIT/example'
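The error occurs and the pod restarts. I also restarted the node, but that did not help.
The filebeat version is 6.5.2, and it is deployed as a DaemonSet. Is this a known issue?
All pods other than filebeat work normally on that node.
Update: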
2020-08-30T03:18:58.770Z ERROR kubernetes/util.go:90 kubernetes: Querying for pod Failed with error: performing request: Get https://10.96.0.1:443/api/v1/namespaces/monitoring/pods/filebeat-gfg5l: dial tcp 10.96.0.1:443: I/O timeout
2020-08-30T03:18:58.770Z INFO kubernetes/watcher.go:180 kubernetes: Performing a resource sync for *v1.PodList
2020-08-30T03:19:28.771Z ERROR kubernetes/watcher.go:183 kubernetes: Performing a resource sync err performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: I/O timeout for *v1.PodList
2020-08-30T03:19:28.771Z INFO instance/beat.go:357 filebeat stopped.
2020-08-30T03:19:28.771Z ERROR instance/beat.go:800 Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: I/O timeout
Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: I/O timeout
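Solution
The add_kubernetes_metadata processor is failing to query https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0. As the discussion above shows, the problem can be fixed by restarting the Beat, which clears a temporary network interface problem.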
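Before restarting, it can help to confirm that the timeout really is a network problem between the pod and the API server. The following is only a minimal sketch, assuming Python is available somewhere that shares the affected pod's network (for example a debug container on the same node). The helper name can_connect is hypothetical; the address 10.96.0.1:443 is the one taken from the log above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only checks TCP reachability
    and does not perform a TLS handshake or an authenticated API request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, connection refused, and unreachable networks.
        return False


# Example usage (address taken from the filebeat log above):
#   can_connect("10.96.0.1", 443)
```

If this returns False on the affected node but True on a healthy node, the failure is consistent with the network interface problem described above rather than with the filebeat configuration.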