java - Running Hive locally with the LZO native library

Tags: java hadoop hive lzo

I am trying to run Hive locally on OS X Mountain Lion, following the instructions here:

https://github.com/twitter/hadoop-lzo

I have compiled the native OS X library and the jar, but I am not sure how to start Hive locally so that Hive/Hadoop picks up the native library.

I tried including it through the JAVA_LIBRARY_PATH environment variable, but I think that only applies to Hadoop.

export JAVA_LIBRARY_PATH="${SCRIPTS_DIR}/jars/native/Mac_OS_X-x86_64-64"
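Because this export only points at a real directory when SCRIPTS_DIR is set, it can be worth checking what the variable actually expands to before starting Hive. The following is a minimal sketch, not part of Hadoop or Hive: check_path_entries is a hypothetical helper name, and it only tests whether each colon-separated entry is an existing directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Hive): print each entry of a
# colon-separated path list, marked "ok" if it is an existing directory
# and "missing" otherwise.
check_path_entries() {
    old_ifs="$IFS"
    IFS=':'
    # $1 is left unquoted on purpose so the shell splits it on ':'
    for entry in $1; do
        if [ -d "$entry" ]; then
            echo "ok      $entry"
        else
            echo "missing $entry"
        fi
    done
    IFS="$old_ifs"
}

# Inspect the variable exported above (prints nothing if it is unset or empty)
check_path_entries "${JAVA_LIBRARY_PATH:-}"
```

An entry reported as missing usually means a variable such as SCRIPTS_DIR was empty, or a relative path was resolved from a different working directory.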

I enable LzopCodec in Hive like this:

SET mapred.output.compression.codec = com.hadoop.compression.lzo.LzopCodec;

A query that launches a Map/Reduce job, such as the one below, then fails with this error:

SELECT COUNT(*) from test_table;


Job running in-process (local Hadoop)
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: native-lzo library not available
        at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:959)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:995)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
        at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:303)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:262)
Caused by: java.lang.RuntimeException: native-lzo library not available
        at com.hadoop.compression.lzo.LzoCodec.getCompressorType(LzoCodec.java:155)
        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:100)
        at com.hadoop.compression.lzo.LzopCodec.getCompressor(LzopCodec.java:135)
        at com.hadoop.compression.lzo.LzopCodec.createOutputStream(LzopCodec.java:70)
        at org.apache.hadoop.hive.ql.exec.Utilities.createCompressedStream(Utilities.java:868)
        at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
        at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:246)
        at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:234)
        ... 14 more

I also tried setting LD_LIBRARY_PATH through mapred.child.env in the Hive script (no luck):

SET mapred.child.env="LD_LIBRARY_PATH=../../scripts/jars/native/Mac_OS_X-x86_64-64";

Best answer

I reread the instructions, which are clear on this point:

How do I configure Hadoop to use these classes?

# Copy the native library
tar -cBf - -C build/hadoop-gpl-compression-0.1.0-dev/lib/native . | tar -xBvf - -C /path/to/hadoop/dist/lib/native

Basically, I just needed to copy the built native library into my Hadoop installation:

ant compile-native tar
cp -r build/hadoop-lzo-0.4.17-SNAPSHOT/lib/native/Mac_OS_X-x86_64-64 /usr/local/Cellar/hadoop/1.1.2/libexec/lib/native/
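To confirm that the copy landed where Hadoop looks for it, a quick check along these lines may help. It is only a sketch under stated assumptions: has_lzo_native and HADOOP_NATIVE_DIR are hypothetical names, and the libgplcompression file-name prefix is an assumption based on the library name that hadoop-lzo builds; the exact extension may differ by platform, so the pattern is left open-ended.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory (searched recursively)
# contains a file whose name starts with libgplcompression.
# The file-name prefix is an assumption about what the hadoop-lzo build produces.
has_lzo_native() {
    dir="$1"
    [ -d "$dir" ] || return 1
    # find prints matching paths; grep -q . succeeds only if there was output
    find "$dir" -name 'libgplcompression*' -print 2>/dev/null | grep -q .
}

# Default taken from the cp command above; override it for your own install.
HADOOP_NATIVE_DIR="${HADOOP_NATIVE_DIR:-/usr/local/Cellar/hadoop/1.1.2/libexec/lib/native}"

if has_lzo_native "$HADOOP_NATIVE_DIR"; then
    echo "native LZO library found under $HADOOP_NATIVE_DIR"
else
    echo "native LZO library NOT found under $HADOOP_NATIVE_DIR"
fi
```

If the library is reported as not found, the "native-lzo library not available" error from the question is the likely result when a job tries to use LzopCodec.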

Source: Stack Overflow, https://stackoverflow.com/questions/15375316/
