hadoop - Writing data to Alluxio with CACHE_THROUGH fails

Tags: hadoop caching mapreduce in-memory alluxio

I am trying to write data to Alluxio using MapReduce. I have about 11 GB of data on HDFS that I am writing to Alluxio. It works fine with the MUST_CACHE write type (the default value of alluxio.user.file.writetype.default).

But when I try to write it with CACHE_THROUGH, it fails with the following exception:

   Error: alluxio.exception.status.UnavailableException: Channel to <hostname of one of the  worker>:29999: <underfs path to file> (No such file or directory)
            at alluxio.client.block.stream.NettyPacketWriter.close(NettyPacketWriter.java:263)
            at com.google.common.io.Closer.close(Closer.java:206)
            at alluxio.client.block.stream.BlockOutStream.close(BlockOutStream.java:166)
            at alluxio.client.file.FileOutStream.close(FileOutStream.java:137)
            at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
            at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
            at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:111)
            at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:679)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:802)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:346)
            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: alluxio.exception.status.NotFoundException: Channel to <hostname of one of the  worker>29999: <underfs path to file> (No such file or directory)
            at alluxio.exception.status.AlluxioStatusException.from(AlluxioStatusException.java:153)
            at alluxio.util.CommonUtils.unwrapResponseFrom(CommonUtils.java:548)
            at alluxio.client.block.stream.NettyPacketWriter$PacketWriteHandler.channelRead(NettyPacketWriter.java:367)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
            at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
            at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
            at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
            at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
            at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
            at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
            at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
            at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
            at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
            at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
            at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
            at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
            at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
            at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
            at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
            at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
            at java.lang.Thread.run(Thread.java:748)

I also tried the following command and got the same error:

./alluxio fs -Dalluxio.user.file.writetype.default=CACHE_THROUGH copyFromLocal <hdfs_input_path> <alluxio_output_path>

Any help or pointers would be appreciated. Thanks.

Best answer

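The copyFromLocal shell command can only copy files that are available on the local file system. To copy a file from HDFS into Alluxio, you can first copy the file to your local machine and then write it into Alluxio.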

hdfs dfs -get <hdfs_input_path> /tmp/tmp_file
alluxio fs copyFromLocal /tmp/tmp_file <alluxio_output_path>
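
The two-step copy above could be wrapped in a small shell function. This is only a sketch: the function name stage_to_alluxio and the WRITE_TYPE variable are made up for illustration, the -D flag is taken from the command in the question, and it assumes the local disk has enough free space for the staged file (about 11 GB in the question).

```shell
# Hypothetical helper: stage an HDFS file on local disk, then load it into Alluxio.
# WRITE_TYPE is an illustrative variable; CACHE_THROUGH is the type from the question.
WRITE_TYPE=${WRITE_TYPE:-CACHE_THROUGH}

stage_to_alluxio() {
  hdfs_src=$1
  alluxio_dst=$2

  # Use a private temporary directory so concurrent runs do not collide.
  tmp_dir=$(mktemp -d) || return 1

  # Step 1: HDFS -> local disk. Step 2: local disk -> Alluxio.
  hdfs dfs -get "$hdfs_src" "$tmp_dir/staged" &&
    alluxio fs -Dalluxio.user.file.writetype.default="$WRITE_TYPE" \
      copyFromLocal "$tmp_dir/staged" "$alluxio_dst"
  status=$?

  # Remove the staged copy whether or not the upload succeeded.
  rm -rf "$tmp_dir"
  return $status
}
```

It would be called as stage_to_alluxio <hdfs_input_path> <alluxio_output_path>, with both hdfs and alluxio on the PATH.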

To write to Alluxio directly from MapReduce, update your core-site.xml to include the following properties:

<property>
  <name>fs.alluxio.impl</name>
  <value>alluxio.hadoop.FileSystem</value>
  <description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
</property>
<property>
  <name>fs.AbstractFileSystem.alluxio.impl</name>
  <value>alluxio.hadoop.AlluxioFileSystem</value>
  <description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
</property>

Then add the Alluxio client jar to your application classpath with -libjars /path/to/client, and write to the URI alluxio://master_hostname:19998/alluxio_output_path. See the documentation for more details.
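
Putting those pieces together, a job submission might look roughly like the sketch below. The function name run_job_on_alluxio, the ALLUXIO_CLIENT_JAR variable, and the jar and class arguments are placeholders, not names from the answer. It also rests on two assumptions: that the job's driver parses Hadoop generic options (for example through ToolRunner), which is what makes -D and -libjars take effect, and that your Alluxio version picks up client properties such as the write type from the Hadoop job configuration.

```shell
# Placeholder path to the Alluxio client jar, as in the answer above.
ALLUXIO_CLIENT_JAR=${ALLUXIO_CLIENT_JAR:-/path/to/client}

# Hypothetical wrapper: submit a MapReduce job whose output goes to Alluxio.
#   $1 = application jar      $2 = driver main class
#   $3 = input path           $4 = alluxio:// output URI
run_job_on_alluxio() {
  app_jar=$1
  main_class=$2
  input=$3
  output_uri=$4

  # Generic options (-D, -libjars) go before the job's own arguments.
  hadoop jar "$app_jar" "$main_class" \
    -Dalluxio.user.file.writetype.default=CACHE_THROUGH \
    -libjars "$ALLUXIO_CLIENT_JAR" \
    "$input" "$output_uri"
}
```

For example: run_job_on_alluxio your-job.jar com.example.YourJob <hdfs_input_path> alluxio://master_hostname:19998/alluxio_output_path, where the jar and class names are stand-ins for your own job.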

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/47687134/
