Problem description
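I am running into a problem with TensorFlow Data Validation on the DirectRunner: I cannot generate statistics from some large datasets of 400 GB or more. All workers appear to stop after emitting the error message "Keepalive watchdog fired. Closing transport." This looks like a gRPC keepalive timeout.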
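For context, a setup of the kind described might look roughly like the sketch below. The post does not show the actual invocation, so everything here is an assumption: the TFRecord input format, the multi-process running mode, the worker count, and the helper names `build_direct_runner_flags` and `generate_stats` are all hypothetical.

```python
from typing import List


def build_direct_runner_flags(num_workers: int) -> List[str]:
    """Return Beam flags that run the DirectRunner with several worker processes."""
    return [
        "--runner=DirectRunner",
        # Assumption: multi-process mode; the post does not say which mode was used.
        "--direct_running_mode=multi_processing",
        f"--direct_num_workers={num_workers}",
    ]


def generate_stats(data_location: str, output_path: str, num_workers: int):
    """Hypothetical wrapper: compute TFDV statistics over TFRecord files."""
    # Imported lazily so the flag helper above works without the heavy packages.
    import tensorflow_data_validation as tfdv
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(build_direct_runner_flags(num_workers))
    # Assumption: the input is TFRecord; the post does not state the data format.
    return tfdv.generate_statistics_from_tfrecord(
        data_location=data_location,
        output_path=output_path,
        pipeline_options=options,
    )
```

In multi-process mode the SDK workers talk to the local job service over gRPC on the loopback interface, which would be consistent with the `ipv6:[::1]` peer address in the log.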
E0804 17:49:07.419950276 44806 chttp2_transport.cc:2881] ipv6:[::1]:40823: Keepalive watchdog fired. Closing transport.
2020-08-04 17:49:07 local_job_service.py : INFO Worker: severity: ERROR timestamp { seconds: 1596563347 nanos: 420487403 } message: "Python sdk harness Failed: \nTraceback (most recent call last):\n File \"/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py\",line 158,in main\n sdk_pipeline_options.view_as(ProfilingOptions))).run()\n File \"/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py\",line 213,in run\n for work_request in self._control_stub.Control(get_responses()):\n File \"/home/ec2-user/lib64/python3.7/site-packages/grpc/_channel.py\",line 416,in __next__\n return self._next()\n File \"/home/ec2-user/lib64/python3.7/site-packages/grpc/_channel.py\",line 706,in _next\n raise self\ngrpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"keepalive watchdog timeout\"\n\tdebug_error_string = \"{\"created\":\"@1596563347.420024732\",\"description\":\"Error received from peer ipv6:[::1]:40823\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":1055,\"grpc_message\":\"keepalive watchdog timeout\",\"grpc_status\":14}\"\n>" trace: "Traceback (most recent call last):\n File \"/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py\",\"grpc_status\":14}\"\n>\n" log_location: "/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py:161" thread: "MainThread"
Traceback (most recent call last):
File "/usr/lib64/python3.7/runpy.py",line 193,in _run_module_as_main
"__main__",mod_spec)
File "/usr/lib64/python3.7/runpy.py",line 85,in _run_code
exec(code,run_globals)
File "/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py",line 248,in <module>
main(sys.argv)
File "/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py",in main
sdk_pipeline_options.view_as(ProfilingOptions))).run()
File "/home/ec2-user/lib64/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",in run
for work_request in self._control_stub.Control(get_responses()):
File "/home/ec2-user/lib64/python3.7/site-packages/grpc/_channel.py",in __next__
return self._next()
File "/home/ec2-user/lib64/python3.7/site-packages/grpc/_channel.py",in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "keepalive watchdog timeout"
debug_error_string = "{"created":"@1596563347.420024732","description":"Error received from peer ipv6:[::1]:40823","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"keepalive watchdog timeout","grpc_status":14}"
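The `debug_error_string` at the end of the traceback above is a JSON payload, so it can be decoded to read the fields more easily. The following is a minimal sketch using only the Python standard library; the payload is copied from the log, and the helper name `summarize_grpc_error` is made up for illustration.

```python
import json
from datetime import datetime, timezone

# Payload copied verbatim from the debug_error_string in the log.
DEBUG_ERROR_STRING = (
    '{"created":"@1596563347.420024732",'
    '"description":"Error received from peer ipv6:[::1]:40823",'
    '"file":"src/core/lib/surface/call.cc","file_line":1055,'
    '"grpc_message":"keepalive watchdog timeout","grpc_status":14}'
)


def summarize_grpc_error(payload: str) -> dict:
    """Extract the time, peer, message and status code from the payload."""
    fields = json.loads(payload)
    # "created" is a Unix timestamp prefixed with "@".
    created = float(fields["created"].lstrip("@"))
    when = datetime.fromtimestamp(created, tz=timezone.utc)
    return {
        "when_utc": when.strftime("%Y-%m-%d %H:%M:%S"),
        "peer": fields["description"].rsplit("peer ", 1)[-1],
        "message": fields["grpc_message"],
        "status": fields["grpc_status"],
    }


summary = summarize_grpc_error(DEBUG_ERROR_STRING)
print(summary)
```

The decoded timestamp matches the `2020-08-04 17:49:07` log line, status 14 corresponds to the `StatusCode.UNAVAILABLE` shown in the traceback, and the peer `[::1]` is the IPv6 loopback address, so the dropped connection was between processes on the same machine.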
Solution
No working solution to this problem has been found yet.