regex - Custom grok pattern - matching multiple patterns

Tags: regex elasticsearch logstash kibana logstash-grok

I asked a similar question before but got no response, so it is time to reword it in the hope of getting some much-needed help.

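Ultimately I want to create an ingest pipeline, but I have hit the first hurdle while trying to build a custom grok pattern in Kibana's Grok Debugger that extracts two fields from a message. Given the following message: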

This is a document with a lengthy text it contains a number of paragraphs and at the end I'll add some markers that indicate additional information I'd like to pull out and add as additional fields. This is the end of the actual document with additional information being added prior to the closing bracket of the RTF.

additionalfield1: this is information associated with additionalfield1

additionalfield2: information associated with additionalfield2



I am trying to create the following fields, but I cannot get both patterns to match; only one or the other matches.
{
  "additionalfield1": ": this is information associated with additionalfield1",
  "additionalfield2": ": this is information associated with additionalfield2"
}

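The screenshot below shows what I am doing when matching a single pattern; I would like to learn how to match and extract both of the fields above. As the screenshot shows, matching one of them (here "additionalfield1") works fine, and the same is true if I switch the pattern to the other field, but if I try to find both, nothing is returned.

[Screenshot: Grok Debugger single pattern matching]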

The screenshot below shows a failed attempt to extract both additionalfield1 and additionalfield2 when both are present; in this case only additionalfield2 was extracted.

[Screenshot]

Any help would be greatly appreciated.

Update:

Clearly I do not really understand this at all. The text obviously contains a number of newlines, but if I use
(?m)%{FINCLASS:finclass}

then additionalfield1 is extracted.

If I then extend the pattern to
(?m)%{FINCLASS:finclass}(?m)%{MYCLASS:myclass}

and enter the following under Custom Patterns:
FINCLASS : (?<=additionalfield1:\s)[^,\n]*
MYCLASS : (?<=additionalfield2:\s)[^,\n]*

I get a message saying the pattern does not match. Yet after additionalfield1 the rest of the line is just a newline, so additionalfield2 always follows that \n.

This is driving me crazy, so if anyone can enlighten a newbie before I tear my hair out, please do.
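The behavior described in this update can be reproduced outside Kibana. The sketch below is only an approximation: it uses Python's `re` module as a stand-in for the regex engine behind grok, shortens the document text, and assumes that each custom pattern's definition is everything after the pattern name (so it starts with `: `). The variable names are hypothetical. The `(?m)` prefix is left out because neither pattern uses `.`, `^`, or `$`, so the flag does not change the result here.

```python
import re

# Shortened stand-in for the message in the question (an assumption for brevity).
message = (
    "This is the end of the actual document with additional information.\n"
    "\n"
    "additionalfield1: this is information associated with additionalfield1\n"
    "\n"
    "additionalfield2: information associated with additionalfield2"
)

# The two custom patterns, taken as everything after the pattern name.
FINCLASS = r": (?<=additionalfield1:\s)[^,\n]*"
MYCLASS = r": (?<=additionalfield2:\s)[^,\n]*"

# Each pattern matches when used alone.
first_only = re.search(FINCLASS, message)
second_only = re.search(MYCLASS, message)

# Placed back to back, MYCLASS must start exactly where FINCLASS stopped.
# FINCLASS stops at the newline, and nothing in the combined pattern
# consumes the newlines or the literal "additionalfield2", so it fails.
both = re.search(FINCLASS + MYCLASS, message)

print(first_only.group(0))   # ": this is information associated with additionalfield1"
print(second_only.group(0))  # ": information associated with additionalfield2"
print(both)                  # None
```

A lookbehind only checks the text before the current position; it does not move the match forward. Everything between the two values therefore still has to be consumed by something in the combined pattern.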

Best answer

Try this:

Input:

This is a document with a lengthy text it contains a number of paragraphs and at the end I'll add some markers that indicate additional information I'd like to pull out and add as additional fields. This is the end of the actual document with additional information being added prior to the closing bracket of the RTF.

additionalfield1: this is information associated with additionalfield1

additionalfield2: information associated with additionalfield2

Grok pattern:
additionalfield1: (?<additionalfield1>([^,]*))additionalfield2: (?<additionalfield2>([^,]*))

Output:
{
  "additionalfield1": [
    [
      "this is information associated with additionalfield1\n\n"
    ]
  ],
  "additionalfield2": [
    [
      "information associated with additionalfield2"
    ]
  ]
}
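For anyone who wants to try the idea outside the Grok Debugger, here is a minimal Python sketch of the same pattern. It is an approximation under stated assumptions: Python's `re` stands in for grok's regex engine, the named groups are rewritten from `(?<name>...)` to Python's `(?P<name>...)`, the document text is shortened, and `extract`, `ANSWER`, and `TIDY` are hypothetical names. The `TIDY` variant is not part of the answer; it is one possible way to keep the trailing newlines out of the first field.

```python
import re

# Shortened stand-in for the input above (an assumption for brevity).
message = (
    "This is the end of the actual document with additional information.\n"
    "\n"
    "additionalfield1: this is information associated with additionalfield1\n"
    "\n"
    "additionalfield2: information associated with additionalfield2"
)

# The answer's pattern in Python named-group syntax.
ANSWER = re.compile(
    r"additionalfield1: (?P<additionalfield1>([^,]*))"
    r"additionalfield2: (?P<additionalfield2>([^,]*))"
)

# Hypothetical variant: stop each value at the end of its line and let
# \s+ consume the blank line between the two markers.
TIDY = re.compile(
    r"additionalfield1: (?P<additionalfield1>[^,\n]*)\s+"
    r"additionalfield2: (?P<additionalfield2>[^,\n]*)"
)

def extract(pattern, text):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.groupdict() if match else None

print(extract(ANSWER, message))
print(extract(TIDY, message))
```

With `ANSWER`, the first field keeps the `\n\n` that separates the two markers, as in the output above. With `TIDY`, both values come back without surrounding whitespace. In either form a comma inside the first value would stop the match from reaching additionalfield2, because the value classes exclude commas.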

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/59047054/
