Cannot connect to service endpoint when reading a file from S3 with Spark and Java

Problem description

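I need to read a file from an S3 bucket into a Spark Dataset. I am using the correct secretKey and accessKey, and I have also tried configuring the endpoint, but I get this error: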

com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
 at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
 at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)

 ... 74 more



java.nio.file.AccessDeniedException: datalakedbr: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: 

 at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:187)
 at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
 at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:265)
 at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
 at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:261)
Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
 at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:159)

This is the code I used:

    SparkSession sparkSession = SparkSession.builder()
            .master("local").appName("readFile")
            .config("fs.s3a.awsAccessKeyId","key")
            .config("fs.s3a.awsSecretAccessKey","secretKey")
            .getOrCreate();
    JavaSparkContext sparkContext = new JavaSparkContext(sparkSession.sparkContext());
    String path = "s3a://bucket/path.json";
    Dataset<Row> file = sparkSession.sqlContext().read().load(path);

Can anyone help?

Solution

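I think the problem is with the property names. The trace lists SimpleAWSCredentialsProvider among the providers that returned no credentials, which suggests the keys set in the code were never picked up.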

See the Hadoop documentation here: https://hadoop.apache.org/docs/r2.7.2/hadoop-aws/tools/hadoop-aws/index.html

It says that for S3A the properties should be named fs.s3a.access.key / fs.s3a.secret.key, not fs.s3a.awsAccessKeyId / fs.s3a.awsSecretAccessKey.

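The awsAccessKeyId-style names belong to the other connectors: fs.s3.awsAccessKeyId for S3, or fs.s3n.awsAccessKeyId for S3N (each with a matching awsSecretAccessKey property).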
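
As a minimal sketch of the renaming described above, the snippet below collects the two S3A credential settings under the documented names. The class `S3aCredentialOptions` and its `build` method are hypothetical helpers invented for this illustration, and the key values are the placeholders from the question. The `spark.hadoop.` prefix is an addition not mentioned in the answer: it is the form Spark documents for passing a property through to the Hadoop configuration. The Spark calls are shown only as comments so the snippet runs without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Spark or Hadoop): gathers the S3A
// credential settings under the property names from the Hadoop documentation.
final class S3aCredentialOptions {

    // Spark copies "spark.hadoop.*" properties into the Hadoop configuration
    // without the prefix, so S3A sees fs.s3a.access.key / fs.s3a.secret.key.
    static final String ACCESS_KEY = "spark.hadoop.fs.s3a.access.key";
    static final String SECRET_KEY = "spark.hadoop.fs.s3a.secret.key";

    private S3aCredentialOptions() { }

    // Returns the two options in insertion order: access key first, then secret key.
    static Map<String, String> build(String accessKey, String secretKey) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put(ACCESS_KEY, accessKey);
        options.put(SECRET_KEY, secretKey);
        return options;
    }

    public static void main(String[] args) {
        // "key" and "secretKey" are the placeholder values used in the question.
        Map<String, String> options = build("key", "secretKey");

        // Print only the property names so the secret is never written to a log.
        for (String name : options.keySet()) {
            System.out.println(name);
        }

        // With Spark on the classpath, the options would be applied like this:
        //
        //   SparkSession.Builder builder = SparkSession.builder()
        //           .master("local").appName("readFile");
        //   for (Map.Entry<String, String> e : options.entrySet()) {
        //       builder.config(e.getKey(), e.getValue());
        //   }
        //   SparkSession sparkSession = builder.getOrCreate();
    }
}
```

Keeping the names in one place makes it harder to mix the S3A names with the older S3/S3N ones. Separately from the credentials, `read().load(path)` uses Spark's default data source, which is Parquet unless `spark.sql.sources.default` is changed, so the JSON file in the question would likely need `read().json(path)` once authentication works.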
