hadoop - NiFi ConvertRecord JSON to CSV produces only a single record?

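I set up the following flow to read JSON data and convert it to CSV with the ConvertRecord processor. However, the output flowfile contains only a single record (I assume just the first one) instead of all the records.

Can anyone help with the correct configuration?

Source JSON data: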

{"creation_Date": "2018-08-19", "Hour_of_day": 7, "log_count": 2136}
{"creation_Date": "2018-08-19", "Hour_of_day": 17, "log_count": 606}
{"creation_Date": "2018-08-19", "Hour_of_day": 14, "log_count": 1328}
{"creation_Date": "2018-08-19", "Hour_of_day": 20, "log_count": 363}

Flow:

ConvertRecord processor configuration:

JsonTreeReader controller service configuration:

CSVRecordSetWriter controller service configuration:

AvroSchemaRegistry controller service configuration:

{
  "type": "record",
  "name": "demo_schema",
  "fields":
  [
    { "name": "creation_Date", "type": "string"},
    { "name": "Hour_of_day", "type": "string"},
    { "name": "log_count", "type": "string"}
  ]
}

Flowfile content I get:

creation_Date,Hour_of_day,log_count
2018-08-16,0,3278

What I need:

creation_Date,Hour_of_day,log_count
2018-08-16,0,3278
2018-08-17,4,278
2018-08-18,10,6723

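I hope I have explained the situation in enough detail. I would appreciate it if someone could help correct the configuration so that I get the complete data. Thanks in advance!

Best answer

You are hitting the bug NIFI-4456, which has been fixed starting with NiFi 1.7.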

To work around this issue:

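1. Use a SplitText processor with Line Split Count = 1.

2. Then use a MergeContent/MergeRecord processor (with Defragment as the merge strategy) to build a valid JSON array of messages.

If you use the MergeRecord processor, the reader and writer controller services both need to be in JSON format.

3. Then feed the merged relationship to the ConvertRecord processor.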

Flow:

From NiFi 1.7 onward, nothing new or additional needs to be configured in the JsonTreeReader controller service, because NiFi can also read JSON in the one-object-per-line format.
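To make the expected behavior concrete, here is a minimal Python sketch of the conversion outside NiFi: it reads one JSON object per line and writes every record to CSV. This is not NiFi code; the function name `ndjson_to_csv` and the `FIELDS` list are illustrative assumptions, and the sample rows are copied from the question's source data.

```python
import csv
import io
import json

# Sample input copied from the question: one JSON object per line.
NDJSON = """\
{"creation_Date": "2018-08-19", "Hour_of_day": 7, "log_count": 2136}
{"creation_Date": "2018-08-19", "Hour_of_day": 17, "log_count": 606}
{"creation_Date": "2018-08-19", "Hour_of_day": 14, "log_count": 1328}
{"creation_Date": "2018-08-19", "Hour_of_day": 20, "log_count": 363}
"""

# Column order taken from the schema in the question.
FIELDS = ["creation_Date", "Hour_of_day", "log_count"]


def ndjson_to_csv(text: str) -> str:
    """Parse each non-empty line as a JSON object and emit one CSV row per object."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


print(ndjson_to_csv(NDJSON))
```

A reader that parses only the first JSON object and then stops would produce the header plus a single row, which matches the symptom described in the question.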

Update:

MergeContent configuration:

If you use the MergeContent processor, configure it as shown in the screenshot below.

Delimiter Strategy Text

Header [

Footer ]

Demarcator ,
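The effect of these Header, Footer and Demarcator settings can be sketched in Python as plain string joining. This is only an illustration of the resulting text, not how NiFi implements MergeContent; `merge_fragments` is a hypothetical helper, and the two fragments are taken from the question's source data.

```python
import json


def merge_fragments(fragments, header="[", footer="]", demarcator=","):
    """Join single-record fragments the way the settings above describe:
    header first, demarcator between fragments, footer last."""
    return header + demarcator.join(fragments) + footer


# Each fragment stands for one flowfile produced by splitting the input line by line.
fragments = [
    '{"creation_Date": "2018-08-19", "Hour_of_day": 7, "log_count": 2136}',
    '{"creation_Date": "2018-08-19", "Hour_of_day": 17, "log_count": 606}',
]

merged = merge_fragments(fragments)

# The merged text is a valid JSON array, so a JSON parser sees every record.
records = json.loads(merged)
print(len(records))
```

Because the merged content parses as a single JSON array, a record reader that handles only one top-level JSON value still sees all the records.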

In addition, I recommend using the MergeRecord processor instead of MergeContent, since it takes care of creating a valid JSON array of messages.

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/51956767/