amazon-web-services - Where to find the console output of a Hive UDF in Amazon EMR

Tags: amazon-web-services hadoop hive amazon-emr

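I am writing a UDF in Java that can be called from Hive queries. Inside the UDF I put System.out.println(msg), hoping to get some output on the console. It works as expected in my local environment, but when deployed to Amazon EMR, the stderr log file does not show any of the messages from my UDF. Where do I find the file that contains the message output?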

Best answer

If Hive submits the query to M/R, any output is captured in the console output of the job that was submitted. See "Where does hadoop mapreduce framework send my System.out.print() statements ? (stdout)". For the EMR-specific flavor of M/R, see View Log Files:

Amazon EMR does not automatically archive log files to Amazon S3. You must configure this when you launch the cluster...

When Amazon EMR is configured to archive log files to Amazon S3, it stores the files in the S3 location you specified, in the /JobFlowId/ folder, where JobFlowId is the cluster identifier.

Be aware that Hive can also run queries locally, in which case the output does not go through an M/R job.
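One thing worth checking before digging through the cluster logs: System.out.println writes to the standard output stream, not standard error, so the message would be expected in a stdout log rather than the stderr file mentioned in the question. The following is a minimal sketch to confirm that locally. The class name UdfStdoutSketch and the helper captureStdout are hypothetical, and the class deliberately does not extend Hive's UDF base class so that it runs without Hive on the classpath; a real UDF would extend org.apache.hadoop.hive.ql.exec.UDF and expose the same evaluate method.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Locale;

// Hypothetical stand-in for a Hive UDF. A real one would extend
// org.apache.hadoop.hive.ql.exec.UDF; that is omitted here (assumption)
// so the sketch compiles and runs without any Hive dependency.
public class UdfStdoutSketch {

    // Mirrors the evaluate() method of a simple UDF: logs via System.out
    // and returns the upper-cased input.
    public static String evaluate(String input) {
        if (input == null) {
            return null;
        }
        // This goes to the process's stdout stream, not stderr.
        System.out.println("[UdfStdoutSketch] evaluate called with: " + input);
        return input.toUpperCase(Locale.ROOT);
    }

    // Helper for illustration: runs evaluate() while temporarily redirecting
    // System.out into a buffer, then returns whatever was printed.
    public static String captureStdout(String input) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true));
        try {
            evaluate(input);
        } finally {
            System.setOut(original); // always restore the real stdout
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String printed = captureStdout("hello");
        // Report on stderr so the two streams are easy to tell apart.
        System.err.println("stdout received: " + printed.trim());
    }
}
```

When the same println runs inside an M/R task on the cluster, the stream belongs to the task's JVM rather than to the Hive client, which is why the message shows up in the job's logs and not in the console where the query was submitted.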

Regarding "amazon-web-services - Where to find the console output of a Hive UDF in Amazon EMR", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/18614397/
