Problem description
TL;DR: how to fix the "java.lang.IllegalStateException: Cannot find any build directories." error when submitting a Spark job to a standalone cluster.
I packaged a Spark application into a Docker image using sbt-native-packager. This produces an image containing all the required jars:
docker run --rm -it --entrypoint ls myimage:latest -l lib
total 199464
[...]
-r--r--r-- 1 demiourgos728 root 3354982 Oct 2 2016 org.apache.hadoop.hadoop-common-2.6.5.jar
[...]
-r--r--r-- 1 demiourgos728 root 8667550 Sep 8 2020 org.apache.spark.spark-core_2.12-2.4.7.jar
[...]
-r--r--r-- 1 demiourgos728 root 5276900 Sep 10 2019 org.scala-lang.scala-library-2.12.10.jar
[...]
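Since a Scala version mismatch is one suspect (see below), the jar names themselves are worth checking: Spark artifacts carry a `_2.xx` binary-version suffix that must match the bundled scala-library's major.minor version. A minimal sketch of that check, using the two jar names from the listing above as stand-ins for the full `ls lib` output:

```shell
# Two sample names from the image's lib/ listing (stand-ins for the real output):
lib_listing="org.apache.spark.spark-core_2.12-2.4.7.jar
org.scala-lang.scala-library-2.12.10.jar"

# Scala binary version expected by the Spark artifacts (the _2.xx suffix):
spark_bv=$(printf '%s\n' "$lib_listing" | grep -o 'spark-core_[0-9.]*' | cut -d_ -f2)

# major.minor of the bundled scala-library jar:
scala_bv=$(printf '%s\n' "$lib_listing" | grep -o 'scala-library-[0-9.]*' \
  | sed 's/scala-library-//' | cut -d. -f1,2)

if [ "$spark_bv" = "$scala_bv" ]; then
  echo "OK: Scala binary version $spark_bv is consistent"
else
  echo "MISMATCH: spark expects $spark_bv, scala-library is $scala_bv"
fi
```

For the jars shown here both resolve to 2.12, so the versions are consistent and a mismatch seems unlikely to be the cause.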
I then set up a standalone cluster with docker-compose:
version: '3'
services:
  spark-driver:
    image: myimage:latest
    ports:
      - "8080:8080"
    command: [
      "-main", "org.apache.spark.deploy.master.Master"
    ]
  spark-worker:
    image: myimage:latest
    ports:
      - "8081:8081"
    depends_on:
      - spark-driver
    command: [
      "-main", "org.apache.spark.deploy.worker.Worker",
      "spark-driver:7077", "--work-dir", "/tmp/spark_work"
    ]
  app:
    image: myimage:latest
    ports:
      - "4040:4040"
    environment:
      SPARK_HOME: "/opt/docker"
    depends_on:
      - spark-worker
    command: [
      "-main", "org.apache.spark.deploy.SparkSubmit",
      "--master", "spark://spark-driver:7077",
      "--class", "io.dummy.MyClass",
      "/opt/docker/lib/io.dummy.mypackage.jar"
    ]
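For reference, the app service's command amounts to the spark-submit invocation below (the flags are taken verbatim from the compose file; printed here rather than executed, since running it needs the cluster):

```shell
# Build the argument list exactly as the app service passes it to SparkSubmit,
# then print the equivalent spark-submit command line for inspection.
set -- \
  --master spark://spark-driver:7077 \
  --class io.dummy.MyClass \
  /opt/docker/lib/io.dummy.mypackage.jar

printf 'spark-submit'
printf ' %s' "$@"
printf '\n'
```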
Running the spark-driver and a few spark-worker instances works (the workers register with the driver, and so on).
However, when my application starts, it keeps failing with errors like this:
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor added: app-20210511122036-0003/0 on worker-20210511114945-172.23.0.3-46727 (172.23.0.3:46727) with 8 core(s)
[o.a.s.s.c.StandaloneSchedulerBackend] Granted executor ID app-20210511122036-0003/0 on hostPort 172.23.0.3:46727 with 8 core(s),1024.0 MB RAM
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor added: app-20210511122036-0003/1 on worker-20210511114945-172.23.0.4-40043 (172.23.0.4:40043) with 8 core(s)
[o.a.s.s.c.StandaloneSchedulerBackend] Granted executor ID app-20210511122036-0003/1 on hostPort 172.23.0.4:40043 with 8 core(s),1024.0 MB RAM
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor updated: app-20210511122036-0003/0 is now RUNNING
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor updated: app-20210511122036-0003/1 is now RUNNING
[o.a.s.s.BlockManagerMaster] Registering BlockManager BlockManagerId(driver,e484e8deb590,41285,None)
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor updated: app-20210511122036-0003/0 is now FAILED (java.lang.IllegalStateException: Cannot find any build directories.)
[o.a.s.s.c.StandaloneSchedulerBackend] Executor app-20210511122036-0003/0 removed: java.lang.IllegalStateException: Cannot find any build directories.
[o.a.s.s.BlockManagerMasterEndpoint] Registering block manager e484e8deb590:41285 with 1917.3 MB RAM,BlockManagerId(driver,None)
[o.a.s.s.BlockManagerMaster] Removal of executor 0 requested
The relevant part seems to be: java.lang.IllegalStateException: Cannot find any build directories.
Judging from various SO posts, it appears to be related to the SPARK_HOME environment variable or to a Scala library version mismatch...
However:
- I tried different SPARK_HOME values (unset, /tmp, /opt/docker), and nothing changed.
- As for Scala: no Scala binary is installed in the image, but the scala-library jar is on the classpath.
What is going on here, and how can I fix it?
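For context, here is a sketch of the kind of check the error text suggests. In Spark 2.4's launcher, the Scala version is resolved either from the SPARK_SCALA_VERSION environment variable or by probing launcher/target/scala-2.1x build directories under SPARK_HOME; a jar-only image such as the one produced by sbt-native-packager has neither. This is an assumption drawn from the error message and the launcher source, not a verified diagnosis, and check_scala_version below is a hypothetical stand-in, not Spark code:

```shell
# Hypothetical mimic of the launcher's Scala-version resolution (assumption):
# env override first, then probe the source-build directories, else fail with
# the message seen in the question.
check_scala_version() {
  home="$1"
  if [ -n "$SPARK_SCALA_VERSION" ]; then
    echo "$SPARK_SCALA_VERSION"
  elif [ -d "$home/launcher/target/scala-2.12" ]; then
    echo "2.12"
  elif [ -d "$home/launcher/target/scala-2.11" ]; then
    echo "2.11"
  else
    echo "Cannot find any build directories." >&2
    return 1
  fi
}

unset SPARK_SCALA_VERSION              # make sure nothing is inherited
tmp=$(mktemp -d)                       # stand-in for a jar-only SPARK_HOME like /opt/docker
check_scala_version "$tmp" || echo "check failed, as in the question"

# With the env var set, the probe is skipped entirely:
( SPARK_SCALA_VERSION=2.12; export SPARK_SCALA_VERSION; check_scala_version "$tmp" )
```

If this reading is right, the directory probe can never succeed inside such an image no matter what SPARK_HOME points to, which would explain why changing its value had no effect.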