hadoop - Error when converting a JSON file to AVRO

Tags: hadoop serialization mapreduce avro

I am following the instructions on the AVRO website and created the JSON and schema files as follows (both are plain text files):

The JSON file:

{"name": "user", "favorite_number": null, "favorite_color": "red"}
{"name": "user", "favorite_number": null, "favorite_color": "green"}
{"name": "user", "favorite_number": null, "favorite_color": "purple"}
{"name": "user", "favorite_number": null, "favorite_color": null}

And the schema file:
{"namespace": "example.avro",
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "favorite_number",  "type": ["int", "null"]},
{"name":"favorite_color", "type": ["string", "null"]}
]
}

When I try to create the avro file using the avro-tools jar, I get the following error message:
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
    at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
    at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
    at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:99)
    at org.apache.avro.tool.Main.run(Main.java:84)
    at org.apache.avro.tool.Main.main(Main.java:73)

Can someone help me resolve this? What am I doing wrong?

Best answer

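Correct the first three lines of the JSON input as shown below, then try again. In Avro's JSON encoding, a non-null value of a union type must be wrapped in an object keyed by the name of the matching branch type, so "red" becomes {"string": "red"}. A null value stays a bare null, which is why the fourth line and the favorite_number fields need no change.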

{"name": "user", "favorite_number": null, "favorite_color":{"string": "red"}}
{"name": "user", "favorite_number": null, "favorite_color":{"string": "green"}}
{"name": "user", "favorite_number": null, "favorite_color":{"string":"purple"}}
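For larger inputs the same wrapping can be done programmatically. The following is a minimal Python sketch, not part of the original answer: the helper names `matches_branch` and `wrap_unions` are hypothetical, the schema is copied from the question, and it assumes flat records whose unions contain only primitive types.

```python
import json

# Schema copied from the question.
SCHEMA = {
    "namespace": "example.avro",
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
        {"name": "favorite_color", "type": ["string", "null"]},
    ],
}


def matches_branch(value, branch):
    """Hypothetical helper: does a plain JSON value fit a primitive Avro type name?"""
    if branch == "string":
        return isinstance(value, str)
    if branch == "boolean":
        return isinstance(value, bool)
    if branch in ("int", "long"):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if branch in ("float", "double"):
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return False


def wrap_unions(record, schema):
    """Hypothetical helper: rewrite a plain JSON record into Avro's JSON encoding.

    Non-null union values become {"<type>": value}; nulls and
    non-union fields are copied unchanged.
    """
    out = {}
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        value = record[name]
        if not isinstance(ftype, list):
            out[name] = value  # not a union: leave as-is
        elif value is None and "null" in ftype:
            out[name] = None  # the null branch is written as a bare null
        else:
            for branch in ftype:
                if branch != "null" and matches_branch(value, branch):
                    out[name] = {branch: value}
                    break
            else:
                raise ValueError(f"no union branch of {name!r} fits {value!r}")
    return out


if __name__ == "__main__":
    line = '{"name": "user", "favorite_number": null, "favorite_color": "red"}'
    print(json.dumps(wrap_unions(json.loads(line), SCHEMA)))
```

Applying `wrap_unions` to each line of the original file and writing the results back out, one JSON object per line, should give input in the form shown in the answer. Nested records, arrays, maps, and named types in unions are deliberately out of scope for this sketch.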

Regarding "hadoop - Error when converting a JSON file to AVRO", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28774933/
