java - S3 error: Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0

Tags: java amazon-web-services hadoop amazon-s3 parquet

I am using the following dependencies:

<properties>
    <javac-source.version>1.8</javac-source.version>
    <javac-target.version>1.8</javac-target.version>
    <hadoop.version>3.2.1</hadoop.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>1.8.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-hadoop</artifactId>
        <version>1.8.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.parquet/parquet-avro -->
    <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-avro</artifactId>
        <version>1.8.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-azure-datalake</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-azure</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>${hadoop.version}</version>
    </dependency>

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-bundle</artifactId>
        <version>1.11.762</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>

I set the following configuration:

    Configuration conf = new Configuration();
    conf.set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
    conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
    conf.set("fs.s3a.access.key","key_id");
    conf.set("fs.s3a.secret.key","key");
    conf.set("fs.s3a.endpoint","s3.us-east-1.amazonaws.com");
    conf.set("fs.s3a.aws.credentials.provider",org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider.NAME);
    conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());

And I use the following path:

new Path("s3a://bucket_name/" + filename)
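
For context, the writer at Main.java:43 in the stack trace below is built roughly along these lines (a minimal sketch consistent with the trace; the schema and record contents are placeholders):

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.hadoop.fs.Path;
    import org.apache.parquet.avro.AvroParquetWriter;
    import org.apache.parquet.hadoop.ParquetWriter;

    // Placeholder schema: a single string field.
    Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Rec\",\"fields\":"
            + "[{\"name\":\"id\",\"type\":\"string\"}]}");

    ParquetWriter<GenericRecord> writer = AvroParquetWriter
            .<GenericRecord>builder(new Path("s3a://bucket_name/" + filename))
            .withSchema(schema)
            .withConf(conf)   // the Configuration shown above
            .build();         // <-- the UnsatisfiedLinkError is thrown here

    GenericRecord record = new GenericData.Record(schema);
    record.put("id", "1");
    writer.write(record);
    writer.close();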

I also tried setting HADOOP_HOME, but it didn't help.

I always get the following error:

20/04/15 17:55:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/04/15 17:55:13 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
20/04/15 17:55:13 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
20/04/15 17:55:13 INFO impl.MetricsSystemImpl: s3a-file-system metrics system started
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:160)
at org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:100)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
at org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:331)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:394)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:477)
at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:213)
at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:589)
at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:811)
at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:190)
at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:168)
at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:822)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:987)
at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:223)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:266)
at org.apache.parquet.hadoop.ParquetWriter$Builder.build(ParquetWriter.java:489)
at Main.main(Main.java:43)

The same credentials work with s3n on Hadoop 2.8.2, but I know that s3n is deprecated as of 3.2.1.

Best Answer

When writing to S3 with Hadoop's s3a client, the local filesystem is used to stage temporary files before they are uploaded. To use that underlying local filesystem on Windows, OS-specific native support must be installed.
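
Incidentally, where s3a stages those blocks is configurable via fs.s3a.fast.upload.buffer: the default "disk" creates a local temp file per upload block (the code path in the trace), while the in-memory settings avoid the temp file at the cost of heap. A sketch using the Configuration from the question; this sidesteps the DiskChecker call in the trace, though installing the native binaries is still the recommended fix:

    // "disk" is the default; "bytebuffer" (off-heap) or "array" (on-heap)
    // buffer upload blocks in memory, so no local temp file is created.
    conf.set("fs.s3a.fast.upload.buffer", "bytebuffer");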

The exception says as much:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z

Since this is the Windows operating system, you need winutils.

  1. Download the winutils.exe binary and the other required files (notably hadoop.dll, which implements NativeIO$Windows) for your Hadoop version from here.
  2. Set the %HADOOP_HOME% environment variable to point to the directory under which those binaries are installed (Hadoop looks for them in %HADOOP_HOME%\bin).
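
If setting the environment variable is awkward (for example when launching from an IDE), the same location can be supplied through the hadoop.home.dir system property, which Hadoop's Shell class checks before falling back to %HADOOP_HOME%. A minimal sketch, assuming the binaries were unpacked to C:\hadoop (a placeholder path):

    // Must run before the first Hadoop class is loaded, because the home
    // directory is resolved in a static initializer. Hadoop then expects
    // winutils.exe at C:\hadoop\bin\winutils.exe.
    System.setProperty("hadoop.home.dir", "C:\\hadoop"); // placeholder path

    // hadoop.dll (which backs NativeIO$Windows.access0) must also be
    // loadable, e.g. add C:\hadoop\bin to PATH or pass
    // -Djava.library.path=C:\hadoop\bin to the JVM.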

Read more: Hadoop2 - Windows Problems

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61232117/
