hadoop map-reduce: how to deploy non-jar files

Tags: hadoop mapreduce

Hi, when I submit my jar with hadoop jar ..args.. to run a map-reduce job, I would like to know how to deploy non-jar files along with it.

For Hadoop streaming there is the -file option to ship files, and for Spark there is --files, but I cannot find such an option in the documentation.

Is it possible to ship non-jar files along with my jar when submitting a Hadoop map-reduce job?

Best Answer

Applications can specify a comma-separated list of paths that will be present in the current working directory of each task using the -files option.

The -libjars option allows applications to add jars to the classpaths of the maps and reduces. The -archives option allows them to pass a comma-separated list of archives as arguments. These archives are unarchived, and a link with the name of the archive is created in the current working directory of each task. More details about the command-line options are available in the Commands Guide. Note that these generic options are parsed by GenericOptionsParser, so the driver class must run through ToolRunner (i.e., implement the Tool interface) for -files, -libjars, and -archives to take effect.

Running the wordcount example with -libjars, -files, and -archives:

hadoop jar hadoop-examples.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output

Here, myarchive.zip will be placed and unzipped into a directory named "myarchive.zip".

Users can specify a different symbolic name for files and archives passed through the -files and -archives options, using #.

For example:

hadoop jar hadoop-examples.jar wordcount -files dir1/dict.txt#dict1,dir2/dict.txt#dict2 -archives mytar.tgz#tgzdir input output

Here, the files dir1/dict.txt and dir2/dict.txt can be accessed by tasks using the symbolic names dict1 and dict2 respectively. The archive mytar.tgz will be placed and unarchived into a directory named "tgzdir".
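Inside a task, a file shipped with -files simply appears under its symbolic name in the task's current working directory, so it can be opened with ordinary file I/O (in a real Mapper this would typically happen in setup()). The following is a minimal plain-Java sketch of that loading pattern; the class name SideFileDemo and the tab-separated dictionary contents are invented for illustration, and the symlink the framework would create is simulated here by writing the file locally:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class SideFileDemo {
    // Load a tab-separated dictionary from the task's working directory.
    // With -files dir1/dict.txt#dict1, the framework creates a "dict1"
    // symlink in the task CWD, so the relative path "dict1" resolves to it.
    static Map<String, String> loadDict(Path p) throws IOException {
        Map<String, String> dict = new HashMap<>();
        for (String line : Files.readAllLines(p)) {
            String[] kv = line.split("\t", 2);
            if (kv.length == 2) {
                dict.put(kv[0], kv[1]);
            }
        }
        return dict;
    }

    public static void main(String[] args) throws IOException {
        // Simulate the symlink the framework would create in the task CWD.
        Path p = Paths.get("dict1");
        Files.write(p, Arrays.asList("hadoop\telephant", "hive\tbee"));

        Map<String, String> dict = loadDict(p);
        System.out.println(dict.get("hadoop")); // prints "elephant"
    }
}
```

The key point is that no HDFS or DistributedCache API call is needed in the task: once the file has been distributed with -files, relative-path file I/O against the symbolic name is enough.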

Regarding "hadoop map-reduce: how to deploy non-jar files", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38363700/
