Problem description
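I have an application built on Spark Structured Streaming 3 with Kafka that processes user logs. After it has been running for a while, the driver starts killing executors, and I don't understand why. The executor logs contain no errors. The executor and driver logs are below.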
On executor 1:
20/08/31 10:01:31 INFO executor.Executor: Finished task 5.0 in stage 791.0 (TID 46411). 1759 bytes result sent to driver
20/08/31 10:01:33 INFO executor.YarnCoarseGrainedExecutorBackend: Driver commanded a shutdown
On executor 2:
20/08/31 10:14:33 INFO executor.YarnCoarseGrainedExecutorBackend: Driver commanded a shutdown
20/08/31 10:14:34 INFO memory.MemoryStore: MemoryStore cleared
20/08/31 10:14:34 INFO storage.BlockManager: BlockManager stopped
20/08/31 10:14:34 INFO util.ShutdownHookManager: Shutdown hook called
On the driver:
20/08/31 10:01:33 ERROR cluster.YarnScheduler: Lost executor 3 on xxx.xxx.xxx.xxx: Executor heartbeat timed out after 130392 ms
20/08/31 10:53:33 ERROR cluster.YarnScheduler: Lost executor 2 on xxx.xxx.xxx.xxx: Executor heartbeat timed out after 125773 ms
20/08/31 10:53:33 ERROR cluster.YarnScheduler: Ignoring update with state FINISHED for TID 129308 because its task set is gone (this is likely the result of receiving duplicate task finished status updates) or its executor has been marked as Failed.
20/08/31 10:53:33 ERROR cluster.YarnScheduler: Ignoring update with state FINISHED for TID 129314 because its task set is gone (this is likely the result of receiving duplicate task finished status updates) or its executor has been marked as Failed.
20/08/31 10:53:33 ERROR cluster.YarnScheduler: Ignoring update with state FINISHED for TID 129311 because its task set is gone (this is likely the result of receiving duplicate task finished status updates) or its executor has been marked as Failed.
20/08/31 10:53:33 ERROR cluster.YarnScheduler: Ignoring update with state FINISHED for TID 129305 because its task set is gone (this is likely the result of receiving duplicate task finished status updates) or its executor has been marked as Failed.
Has anyone run into the same problem and solved it?
Solution
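Looking at the information available:
- no errors
- "Driver commanded a shutdown"
- YARN logs showing "state FINISHED"
this appears to be expected behavior.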
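This usually happens if you forget to wait for the Spark Streaming query to terminate. If your code does not end with query.awaitTermination(), the streaming application shuts down as soon as all the data has been processed.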
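As a rough illustration of the point above, here is a minimal PySpark sketch of a Kafka-backed streaming job that blocks on the query. The broker address, topic name, checkpoint path, console sink, and the helper name kafka_source_options are all placeholders chosen for this example, not taken from the question, and the job assumes the spark-sql-kafka connector is available on the classpath.

```python
def kafka_source_options(bootstrap_servers: str, topic: str) -> dict:
    """Build the option map for the Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def main() -> None:
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("user-logs-stream").getOrCreate()

    # Placeholder broker and topic; replace with real values.
    logs = (
        spark.readStream
        .format("kafka")
        .options(**kafka_source_options("localhost:9092", "user-logs"))
        .load()
    )

    # Console sink and checkpoint path are assumptions for the sketch.
    query = (
        logs.selectExpr("CAST(value AS STRING) AS line")
        .writeStream
        .format("console")
        .outputMode("append")
        .option("checkpointLocation", "/tmp/user-logs-checkpoint")
        .start()
    )

    # Block the driver until the query stops or fails. Without this call,
    # main() returns right after start() and the application exits.
    query.awaitTermination()
```

Call main() at the end of the script passed to spark-submit. If the job starts several queries, spark.streams.awaitAnyTermination() can be used instead to block until any one of them terminates.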