hadoop - Apache Shark 0.9.1 cannot connect to HDFS?

Tags: hadoop apache-spark shark-sql

In Shark, when I run:

CREATE EXTERNAL TABLE test (
  memberId STRING,
  category STRING,
  message STRING,
  source STRING,
  event_type STRING,
  log_level STRING,
  path STRING,
  host STRING,
  event_timestamp STRING,
  eventFields MAP<STRING,STRING>
)
PARTITIONED BY (datePart STRING)
ROW FORMAT SERDE 'com.company.eventserde.EventSerde'
LOCATION '/user/ubuntu/test';

I get:

[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Time taken (including network latency): 0.05 seconds

The error log shows:

35.526: [Full GC 112196K->28191K(1013632K), 0.1913800 secs]
FAILED: Error in metadata: MetaException(message:file:/user/ubuntu/events is not a directory or unable to create one)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
FAILED: Error in metadata: MetaException(message:file:/user/ubuntu/test is not a directory or unable to create one)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Does anyone know why Shark is not creating the table in Hadoop?

Best answer

Try specifying the full HDFS URI for the location, like this:

LOCATION 'hdfs://<NAMENODE-IP>:<NAMENODE-IPC-PORT>/user/ubuntu/test';
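Note the `file:/user/ubuntu/test` prefix in the error log: without a scheme, Hive resolved the unqualified path against the default filesystem, which was the local filesystem rather than HDFS. Prefixing the location with `hdfs://` overrides that. Alternatively, you can set the default filesystem to HDFS in `core-site.xml` so that unqualified paths resolve to the namenode. A sketch, assuming a Shark-0.9-era Hadoop 1.x deployment (where the property is `fs.default.name`; on Hadoop 2.x it is `fs.defaultFS`); the host and port below are placeholders:

```xml
<!-- core-site.xml: make unqualified paths like /user/ubuntu/test
     resolve against HDFS instead of the local filesystem.
     Replace host/port with your namenode's actual address. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```

After changing this, restart Shark (and the Hive metastore, if it runs separately) so the new default filesystem is picked up.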

Regarding "hadoop - Apache Shark 0.9.1 cannot connect to HDFS?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23225631/
