Problem description
I get an error when trying to write data to Redshift with PySpark on an EMR cluster.
df.write.format("jdbc") \
    .option("url", "jdbc:redshift://clustername.yyyyy.us-east-1.redshift.amazonaws.com:5439/db") \
    .option("driver", "com.amazon.redshift.jdbc42.Driver") \
    .option("dbtable", "public.table") \
    .option("user", user_redshift) \
    .option("password", password_redshift) \
    .mode("overwrite") \
    .save()
The error I get is:
py4j.protocol.Py4JJavaError: An error occurred while calling o143.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, executor 1):
java.sql.SQLException: [Amazon](500310) Invalid operation: The session is read-only;
    at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.handleMessage(Unknown Source)
    at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.getParameterDescription(Unknown Source)
    at com.amazon.redshift.client.PGClient.prepareStatement(Unknown Source)
    at com.amazon.redshift.dataengine.PGQueryExecutor.<init>(Unknown Source)
    at com.amazon.redshift.dataengine.PGDataEngine.prepare(Unknown Source)
    at com.amazon.jdbc.common.SPreparedStatement.<init>(Unknown Source)
...
Any help would be appreciated. Thanks!
Solution
No working solution for this problem has been found yet.
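Not a verified fix, but the "(500310) Invalid operation: The session is read-only" message means the Redshift JDBC driver rejected the write because the session was opened read-only. A sketch worth trying, under two assumptions: the Amazon Redshift JDBC driver accepts a `ReadOnly` connection property (check your driver version's documentation), and URL parameters use the standard `?key=value&key=value` syntax:

```python
def with_read_only_disabled(url: str) -> str:
    """Append ReadOnly=false to a JDBC URL.

    Assumes '?key=value&key=value' parameter syntax; 'ReadOnly' is the
    property name documented for the Amazon Redshift JDBC driver.
    """
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}ReadOnly=false"


url = "jdbc:redshift://clustername.yyyyy.us-east-1.redshift.amazonaws.com:5439/db"

# In the Spark writer, either use the modified URL:
#   .option("url", with_read_only_disabled(url))
# or pass the property as its own option, since Spark's JDBC data source
# forwards unrecognized options to the driver as connection properties:
#   .option("ReadOnly", "false")
```

If that changes nothing, also check that the endpoint being connected to is the cluster's writable endpoint and that the cluster is not temporarily read-only (for example, during a resize).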