hadoop - Flink on YARN: Amazon S3 wrongly used instead of HDFS

Tags: hadoop amazon-s3 hadoop-yarn apache-flink flink-cep

I followed Flink on YARN's setup documentation. However, when I run ./bin/yarn-session.sh -n 2 -jm 1024 -tm 2048 while authenticated to Kerberos, I get the following error:

2016-06-16 17:46:47,760 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-06-16 17:46:48,518 INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl     - Timeline service address: https://**host**:8190/ws/v1/timeline/
2016-06-16 17:46:48,814 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Using values:
2016-06-16 17:46:48,815 INFO  org.apache.flink.yarn.FlinkYarnClient                         -   TaskManager count = 2
2016-06-16 17:46:48,815 INFO  org.apache.flink.yarn.FlinkYarnClient                         -   JobManager memory = 1024
2016-06-16 17:46:48,815 INFO  org.apache.flink.yarn.FlinkYarnClient                         -   TaskManager memory = 2048
Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated
    at java.util.ServiceLoader.fail(ServiceLoader.java:224)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2623)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2634)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
    at org.apache.flink.yarn.FlinkYarnClientBase.deployInternal(FlinkYarnClientBase.java:531)
    at org.apache.flink.yarn.FlinkYarnClientBase$1.run(FlinkYarnClientBase.java:342)
    at org.apache.flink.yarn.FlinkYarnClientBase$1.run(FlinkYarnClientBase.java:339)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.flink.yarn.FlinkYarnClientBase.deploy(FlinkYarnClientBase.java:339)
    at org.apache.flink.client.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:419)
    at org.apache.flink.client.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:362)
Caused by: java.lang.NoClassDefFoundError: com/amazonaws/AmazonServiceException
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2532)
    at java.lang.Class.getConstructor0(Class.java:2842)
    at java.lang.Class.newInstance(Class.java:345)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
    ... 18 more
Caused by: java.lang.ClassNotFoundException: com.amazonaws.AmazonServiceException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 23 more

I have set the following properties in my ./flink-1.0.3/conf/flink-conf.yaml:

fs.hdfs.hadoopconf: /etc/hadoop/conf/
fs.hdfs.hdfssite: /etc/hadoop/conf/hdfs-site.xml

How can I use HDFS instead of Amazon's S3?

Thanks.

Best answer

I think the problem is that Flink is not picking up your Hadoop configuration files.

Could you remove the line starting with fs.hdfs.hdfssite from your flink-conf.yaml? It is not needed if fs.hdfs.hadoopconf is set.

Also, could you check that fs.defaultFS in core-site.xml is set to a value starting with hdfs://?
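One way to run that last check without opening the file by hand is sketched below. It is only an illustration, not part of Flink or Hadoop: the helper names read_default_fs and is_hdfs are made up for this example, and it assumes core-site.xml lives in the /etc/hadoop/conf directory from the question. It also looks at fs.default.name, the older, deprecated name for the same Hadoop setting.

```python
import os
import xml.etree.ElementTree as ET


def read_default_fs(core_site_path):
    """Return the default file system URI from a core-site.xml file, or None.

    Looks for fs.defaultFS first and falls back to the deprecated
    fs.default.name key. (Hypothetical helper for this example.)
    """
    root = ET.parse(core_site_path).getroot()
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in ("fs.defaultFS", "fs.default.name"):
            found[name] = value
    return found.get("fs.defaultFS") or found.get("fs.default.name")


def is_hdfs(default_fs):
    """True if the configured default file system uses the hdfs:// scheme."""
    return default_fs is not None and default_fs.startswith("hdfs://")


if __name__ == "__main__":
    # Assumption: same directory as fs.hdfs.hadoopconf in the question.
    conf_dir = "/etc/hadoop/conf"
    path = os.path.join(conf_dir, "core-site.xml")
    if os.path.exists(path):
        value = read_default_fs(path)
        print(f"default file system: {value!r}")
        if is_hdfs(value):
            print("OK: the default file system is HDFS")
        else:
            print("The default file system does not start with hdfs://")
    else:
        print(f"{path} not found; adjust conf_dir for your cluster")
```

If the script reports a value such as s3a://... or no value at all, the Hadoop configuration that Flink is meant to read does not point at HDFS by default, so that file is the first thing to fix.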

Source: this question ("Flink on YARN: Amazon S3 wrongly used instead of HDFS") on Stack Overflow: https://stackoverflow.com/questions/37864969/
