YARN does not recognize increased 'yarn.scheduler.maximum-allocation-mb' and 'yarn.nodemanager.resource.memory-mb' values

Problem description

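I am working with a dockerized PySpark cluster that runs on YARN. To make my data processing pipelines more efficient, I want to increase the amount of memory allocated to the PySpark executors and the driver.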

I do this by adding the following two key-value pairs to the REST POST request that is sent to Livy: "driverMemory": "20g", "executorMemory": "56g"
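For readers who want to reproduce the request, here is a minimal sketch of how such a payload could be built in Python. The endpoint URL, the use of the `/sessions` route, and the `kind` field are assumptions for illustration; the question only states that `driverMemory` and `executorMemory` are sent to Livy in a POST request.

```python
import json
import urllib.request

# Hypothetical Livy endpoint (assumption); replace with your own Livy server address.
LIVY_URL = "http://localhost:8998/sessions"


def build_livy_request(driver_memory: str, executor_memory: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking Livy for a PySpark session."""
    payload = {
        "kind": "pyspark",  # assumption: an interactive PySpark session
        "driverMemory": driver_memory,
        "executorMemory": executor_memory,
    }
    return urllib.request.Request(
        LIVY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_livy_request("20g", "56g")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually submit the request: urllib.request.urlopen(request)
```

The request is only constructed here, not sent, so the sketch can be run without a Livy server.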

Doing so results in the following error, which I found in Livy's logs: java.lang.IllegalArgumentException: Required executor memory (57344), overhead (5734 MB), and PySpark memory (0 MB) is above the max threshold (8192 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.

Of course, I have edited yarn-site.xml accordingly and set both of the above values to 64 GB by adding the corresponding lines to the file, but this does not seem to make any difference.

A similar problem occurs with other driverMemory and executorMemory values whenever executorMemory plus the 10% overhead exceeds 8192 MB.
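The arithmetic behind the error can be checked with a short sketch. It assumes Spark's default overhead rule (10% of executor memory, with a 384 MB minimum) and uses hypothetical helper names; it is not Spark's own code.

```python
MIN_OVERHEAD_MB = 384  # Spark's documented minimum memory overhead


def required_container_mb(executor_memory_mb: int, pyspark_memory_mb: int = 0) -> int:
    """Memory YARN must grant per executor: heap + overhead + PySpark memory."""
    overhead_mb = max(MIN_OVERHEAD_MB, executor_memory_mb // 10)  # 10% default, rounded down
    return executor_memory_mb + overhead_mb + pyspark_memory_mb


def fits(executor_memory_mb: int, max_allocation_mb: int) -> bool:
    """True if the request stays within the cluster's maximum allocation."""
    return required_container_mb(executor_memory_mb) <= max_allocation_mb


executor_mb = 56 * 1024                      # "56g" -> 57344 MB
needed_mb = required_container_mb(executor_mb)
print(needed_mb)                             # 57344 + 5734 = 63078
print(fits(executor_mb, 8192))               # the threshold reported in the error
print(fits(executor_mb, 64 * 1024))          # the 64 GB configured in yarn-site.xml
```

Under these assumptions the request needs 63078 MB, which is below 64 GB (65536 MB). The configured value would therefore be sufficient if YARN were actually using it, which suggests the edited configuration is not being picked up.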

How can I fix this and allocate more executor memory?

Solution

Make sure that your yarn-site.xml looks exactly the same on the master and worker containers at the time the services are started.

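It seems that you may have edited it only on the master, which could be the cause of this confusion. As a general rule of thumb, all config files (along with many other things) must look exactly the same on all machines in a cluster.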
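One way to check this is to compare the two relevant properties across the yarn-site.xml contents of each container. The sketch below is only an illustration: the node names and the sample XML are hypothetical, and in practice you would read the file from each container (for example via `docker exec`) instead of using inline strings.

```python
import xml.etree.ElementTree as ET

KEYS = (
    "yarn.scheduler.maximum-allocation-mb",
    "yarn.nodemanager.resource.memory-mb",
)


def read_properties(yarn_site_xml: str) -> dict:
    """Parse the <property> entries of a yarn-site.xml document into a dict."""
    root = ET.fromstring(yarn_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def find_mismatches(configs: dict) -> dict:
    """Return, per key, the value seen on each node when the nodes disagree."""
    mismatches = {}
    for key in KEYS:
        values = {node: read_properties(xml).get(key) for node, xml in configs.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


# Hypothetical sample data: the master was edited, the worker was not.
configs = {
    "master": """<configuration>
        <property><name>yarn.scheduler.maximum-allocation-mb</name><value>65536</value></property>
        <property><name>yarn.nodemanager.resource.memory-mb</name><value>65536</value></property>
    </configuration>""",
    "worker-1": "<configuration></configuration>",
}

for key, values in find_mismatches(configs).items():
    print(key, values)
```

A value of `None` for a node means the key is missing from that node's file, so YARN falls back to its default there. Remember that the services also need to be restarted after the files are aligned, since the configuration is read at startup.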

apache-spark hive livy yarn