spark.driver.extraLibraryPath overrides the original library path

Problem description

I have a Spark job running on an AWS EMR cluster that needs access to native libraries (*.so). Per the Spark documentation (https://spark.apache.org/docs/2.3.0/configuration.html), I added the "spark.driver.extraLibraryPath" and "spark.executor.extraLibraryPath" options to the spark-submit command line:

spark-submit \
--class test.Clustering \
--conf spark.executor.extraLibraryPath="/opt/test/lib/native" \
--conf spark.driver.extraLibraryPath="/opt/test/lib/native" \
--master yarn \
--deploy-mode client \
s3-etl-prepare-1.0-SNAPSHOT-jar-with-dependencies.jar "$@"

This works as expected and my native libraries are loaded. The problem: during the Spark job I also need to run a DistributedLzoIndexer MR job, which depends on the lzo native library, and the lzo code then fails to load the native GPL library:

21/06/16 09:49:09 ERROR GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
        at java.lang.Runtime.loadLibrary0(Runtime.java:870)
        at java.lang.System.loadLibrary(System.java:1124)
        at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
        at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
        at com.hadoop.compression.lzo.DistributedLzoIndexer.<init>(DistributedLzoIndexer.java:28)
        at test.misc.FileHelper.distributIndexLzoFile(FileHelper.scala:260)
        at test.scalaapp.Clustering$.main(Clustering.scala:66)
        at test.scalaapp.Clustering.main(Clustering.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

It seems that the "spark.driver.extraLibraryPath" option overrides or replaces the entire library path instead of appending a new entry. How can I keep both the GPL lzo native path and my own library path?
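On EMR, both of these properties usually come pre-set in spark-defaults.conf, so passing them on the command line replaces the cluster defaults rather than extending them. A quick way to see what was overridden (assuming the standard /etc/spark/conf location on EMR):

grep extraLibraryPath /etc/spark/conf/spark-defaults.conf

On EMR this typically reports /usr/lib/hadoop/lib/native and /usr/lib/hadoop-lzo/lib/native, the latter being where the gplcompression library lives.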

Workaround

No verified solution to this problem has been found yet.
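One unverified approach, sketched under the assumption that the cluster defaults are /usr/lib/hadoop/lib/native and /usr/lib/hadoop-lzo/lib/native (substitute whatever your own spark-defaults.conf reports): both properties accept a colon-separated list of directories, so the cluster defaults can be passed along explicitly instead of being replaced:

spark-submit \
--class test.Clustering \
--conf spark.executor.extraLibraryPath="/opt/test/lib/native:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native" \
--conf spark.driver.extraLibraryPath="/opt/test/lib/native:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native" \
--master yarn \
--deploy-mode client \
s3-etl-prepare-1.0-SNAPSHOT-jar-with-dependencies.jar "$@"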

