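I'm trying to get Fluentd to parse Java stack traces coming from the Docker logging driver, using in_tail, and emit each trace as a single message.
For the life of me, I can't figure out why it still splits them apart.
Here is a sample of the input as it is written to the file: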
2015-12-17T19:19:47+00:00 docker.java.ubuntu:15.10 {"log":"Exception in thread main java.lang.NullPointerException\r","container_id":"5a064eb23465350a11fe00b1f7787f5bd3e9f0182dd44c09516a72ab4006bd54","container_name":"/src-test_1.0.0.353_989549167.1","source":"stdout"}
2015-12-17T19:19:47+00:00 docker.java.ubuntu:15.10 {"container_id":"5a064eb23465350a11fe00b1f7787f5bd3e9f0182dd44c09516a72ab4006bd54","container_name":"/src-test_1.0.0.353_989549167.1","source":"stdout","log":" at com.example.myproject.Book.getTitle(Book.java:16)\r"}
2015-12-17T19:19:47+00:00 docker.java.ubuntu:15.10 {"container_name":"/src-test_1.0.0.353_989549167.1","source":"stdout","log":" at com.example.myproject.Author.getBookTitles(Author.java:25)\r","container_id":"5a064eb23465350a11fe00b1f7787f5bd3e9f0182dd44c09516a72ab4006bd54"}
2015-12-17T19:19:47+00:00 docker.java.ubuntu:15.10 {"container_id":"5a064eb23465350a11fe00b1f7787f5bd3e9f0182dd44c09516a72ab4006bd54","container_name":"/src-test_1.0.0.353_989549167.1","source":"stdout","log":" at com.example.myproject.Bootstrap.main(Bootstrap.java:14)\r"}
2015-12-17T19:19:47+00:00 docker.java.ubuntu:15.10 {"container_id":"5a064eb23465350a11fe00b1f7787f5bd3e9f0182dd44c09516a72ab4006bd54","container_name":"/src-test_1.0.0.353_989549167.1","source":"stdout","log":"test\r"}
Here is the configuration I'm using for in_tail:
<source>
@type tail
tag docker.multiline
path /tmp/fluent/java*
pos_file /tmp/fluent/log.pos
refresh_interval 10
format multiline
format first_line /.*\"log\":\"[^\s].*/
format /\"log\":\"(?<message>.+)\\r/
</source>
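The regexes look correct to me. When I plug them into a regex tester, the first_line regex matches only the first and last lines of my sample, while the format regex matches every line but captures only the stack trace text, as I expect. Even so, every line comes out as a separate message, almost as if first_line were matching every line rather than just the first and last.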
Best answer
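According to https://docs.fluentd.org/v0.12/articles/parser_multiline, the configuration keys should be format_firstline and format1 (not format first_line and format). The multiline parser expects a single format_firstline key and numbered formatN keys. Written with a space, format first_line is most likely read as just another format setting, so the parser never receives a first-line pattern and each line is emitted on its own.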
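For illustration, here is how the <source> block from the question might look with the multiline parser's documented key names, format_firstline and format1. This is an untested sketch: the tag, paths, and regular expressions are copied unchanged from the question, and only the two key names differ.

```
<source>
  @type tail
  tag docker.multiline
  path /tmp/fluent/java*
  pos_file /tmp/fluent/log.pos
  refresh_interval 10
  format multiline
  # A line matching this pattern starts a new event; lines that do not
  # match are appended to the event currently being buffered.
  format_firstline /.*\"log\":\"[^\s].*/
  # Applied to the whole buffered event, not to each line separately.
  format1 /\"log\":\"(?<message>.+)\\r/
</source>
```

Because format1 runs against the combined lines, the greedy .+ will probably capture the JSON between the individual "log" values as well, so the pattern may need tightening once the lines are actually being grouped.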
This question, "regex - Fluentd capture stack traces from Docker", was originally asked on Stack Overflow: https://stackoverflow.com/questions/34342820/