amazon-s3 - Setting AWS credentials for a Spark program in 3 ways, but none of them work

I am starting a Spark hive-server cluster that uses S3 as its warehouse. I set my AWS credentials redundantly, in 3 ways:

In hdfs-site.xml under $SPARK_HOME/conf:

<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>****</value>
</property>

<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>****</value>
</property>

By setting system properties on the executors with spark.executor.extraJavaOptions in the start-thriftserver.sh arguments:

--conf "spark.executor.extraJavaOptions=-Dfs.s3.awsAccessKeyId=**** -Dfs.s3.awsSecretAccessKey=****" \

By setting the environment variables before starting the Thrift server.

The startup script looks like this:

AWS_ACCESS_KEY_ID=**** \
AWS_SECRET_ACCESS_KEY=**** \
$SPARK_HOME/sbin/start-thriftserver.sh \
--conf "spark.executor.extraJavaOptions=-Dfs.s3.awsAccessKeyId=**** -Dfs.s3.awsSecretAccessKey=****" \
--hiveconf hive.metastore.warehouse.dir=s3n://testdata

But when I run any CREATE TABLE query, I still get:

Error: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).) (state=,code=0)

What is going on? Why does none of them work the way the documentation says?

Best answer

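Oops, the problem was in my hdfs-site.xml. The warehouse URL uses the s3n scheme, but I had only set the fs.s3.* properties. I should have added the credential properties for every URL scheme that S3 supports: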

<configuration>

<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>****</value>
</property>

<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>****</value>
</property>

<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>****</value>
</property>

<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>****</value>
</property>

<property>
  <name>fs.s3a.access.key</name>
  <value>****</value>
</property>

<property>
  <name>fs.s3a.secret.key</name>
  <value>****</value>
</property>

</configuration>

The error no longer appears. It is a bit inconvenient, but I am glad it works now.
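The mismatch behind the error can be checked before starting the server. The following is a minimal sketch, not part of Spark or Hadoop: the helper names `s3_credential_properties` and `missing_s3_credentials` are hypothetical. It assumes the naming seen in the error message (fs.<scheme>.awsAccessKeyId and fs.<scheme>.awsSecretAccessKey for s3 and s3n), that the S3A connector reads fs.s3a.access.key and fs.s3a.secret.key instead, and that each `<name>` element in the config file sits on a single line.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not shipped with Spark or Hadoop.

# Print the credential property names that match the scheme of an S3 URI.
s3_credential_properties() {
  uri="$1"
  scheme="${uri%%://*}"   # text before the first "://"
  case "$scheme" in
    s3|s3n)
      printf 'fs.%s.awsAccessKeyId\nfs.%s.awsSecretAccessKey\n' "$scheme" "$scheme"
      ;;
    s3a)
      # S3A uses differently named keys.
      printf 'fs.s3a.access.key\nfs.s3a.secret.key\n'
      ;;
    *)
      echo "unsupported scheme in: $uri" >&2
      return 1
      ;;
  esac
}

# List the properties for the URI's scheme that are absent from a Hadoop
# XML config file. Plain text match on <name>...</name>, not an XML parse.
# Returns 0 if all are present, 1 if any is missing, 2 on a bad URI.
missing_s3_credentials() {
  conf_file="$2"
  props=$(s3_credential_properties "$1") || return 2
  status=0
  for prop in $props; do
    if ! grep -qF "<name>${prop}</name>" "$conf_file"; then
      echo "missing: $prop"
      status=1
    fi
  done
  return "$status"
}
```

Run as `missing_s3_credentials s3n://testdata "$SPARK_HOME/conf/hdfs-site.xml"` against the hdfs-site.xml from the question, this would report both fs.s3n.* properties as missing, which matches what the exception complains about.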

Source: Stack Overflow, https://stackoverflow.com/questions/32030792/
