json - Datadog Grok parsing - extracting fields from nested JSON

Tags: json logging datadog

Is it possible to extract JSON fields that are nested inside a log line?

The sample I have been working with:

thread-191555 app.main - [cid: 2cacd6f9-546d-41ew-a7ce-d5d41b39eb8f, uid: e6ffc3b0-2f39-44f7-85b6-1abf5f9ad970] Request: protocol=[HTTP/1.0] method=[POST] path=[/metrics] headers=[Timeout-Access: <function1>, Remote-Address: 192.168.0.1:37936, Host: app:5000, Connection: close, X-Real-Ip: 192.168.1.1, X-Forwarded-For: 192.168.1.1, Authorization: ***, Accept: application/json, text/plain, */*, Referer: https://google.com, Accept-Language: cs-CZ, Accept-Encoding: gzip, deflate, User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko, Cache-Control: no-cache] entity=[HttpEntity.Strict application/json {"type":"text","extract": "text", "field2":"text2","duration": 451 }

What I want to achieve is:
{
"extract": "text",
"duration": "451"
}

I tried combining a sample regex ( "(extract)"\s*:\s*"([^"]+)",? ) with example_parser %{data::json} (using JSON as the log sample data, for starters), but I have not managed to get anything working.

Thanks in advance!

Best answer

Is that sample text formatted properly? The final entity object is missing a ] at the end.
entity=[HttpEntity.Strict application/json {"type":"text","extract": "text", "field2":"text2","duration": 451 }
should be
entity=[HttpEntity.Strict application/json {"type":"text","extract": "text", "field2":"text2","duration": 451 }]
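I am going to proceed with these instructions assuming that was a typo and the entity field actually ends with ]. If it does not, I think you need to fix the underlying log so that it is properly formatted and closes the bracket.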

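Instead of skipping over the whole log and parsing out only that JSON bit, I decided to parse the whole thing and show what a good end result looks like. So the first thing we need to do is pull out the set of key/value pairs that follows the Request object: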

Sample input: thread-191555 app.main - [cid: 2cacd6f9-546d-41ew-a7ce-d5d41b39eb8f, uid: e6ffc3b0-2f39-44f7-85b6-1abf5f9ad970] Request: protocol=[HTTP/1.0] method=[POST] path=[/metrics] headers=[Timeout-Access: <function1>, Remote-Address: 192.168.0.1:37936, Host: app:5000, Connection: close, X-Real-Ip: 192.168.1.1, X-Forwarded-For: 192.168.1.1, Authorization: ***, Accept: application/json, text/plain, */*, Referer: https://google.com, Accept-Language: cs-CZ, Accept-Encoding: gzip, deflate, User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko, Cache-Control: no-cache] entity=[HttpEntity.Strict application/json {"type":"text","extract": "text", "field2":"text2","duration": 451 }]
Grok parser rule: app_log thread-%{integer:thread} %{notSpace:file} - \[%{data::keyvalue(": ")}\] Request: %{data:request:keyvalue("=","","[]")}
Result:

{
  "thread": 191555,
  "file": "app.main",
  "cid": "2cacd6f9-546d-41ew-a7ce-d5d41b39eb8f",
  "uid": "e6ffc3b0-2f39-44f7-85b6-1abf5f9ad970",
  "request": {
    "protocol": "HTTP/1.0",
    "method": "POST",
    "path": "/metrics",
    "headers": "Timeout-Access: <function1>, Remote-Address: 192.168.0.1:37936, Host: app:5000, Connection: close, X-Real-Ip: 192.168.1.1, X-Forwarded-For: 192.168.1.1, Authorization: ***, Accept: application/json, text/plain, */*, Referer: https://google.com, Accept-Language: cs-CZ, Accept-Encoding: gzip, deflate, User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko, Cache-Control: no-cache",
    "entity": "HttpEntity.Strict application/json {\"type\":\"text\",\"extract\": \"text\", \"field2\":\"text2\",\"duration\": 451 }"
  }
}


Notice how we use the keyvalue parser with [] as the quoting string, which lets us easily pull everything out of the request object.
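To make the effect of the [] quoting easier to see outside Datadog, here is a minimal Python sketch that mimics what that keyvalue step produces for key=[value] pairs. It is only an approximation for illustration, not Datadog's implementation; the shortened sample string, the PAIR regex, and the parse_bracketed_pairs helper are assumptions made up for this example.

```python
import re

# Shortened sample (assumption): the part of the log that follows "Request: ".
request_part = (
    "protocol=[HTTP/1.0] method=[POST] path=[/metrics] "
    'entity=[HttpEntity.Strict application/json {"extract": "text", "duration": 451 }]'
)

# key=[value]: the value runs up to the first "]".
# Nested or escaped brackets are not handled in this sketch.
PAIR = re.compile(r"(\w+)=\[([^\]]*)\]")


def parse_bracketed_pairs(text):
    """Return a dict of key -> value for every key=[value] pair in text."""
    return dict(PAIR.findall(text))


request = parse_bracketed_pairs(request_part)
print(request["method"])
print(request["entity"])
```

Because the brackets delimit each value, spaces and quotes inside entity do not split the pair, which is the same reason the [] quoting string matters in the Grok rule.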

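Now the goal is to pull the details out of the entity field inside the request object. With a Grok parser you can specify a particular attribute to parse further.

So in the same pipeline we add another Grok parser processor, right after the first one.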


Then configure the Advanced Options section to run on request.entity, since that is what we called the attribute.


Sample input: HttpEntity.Strict application/json {"type":"text","extract": "text", "field2":"text2","duration": 451 }
Grok parser rule: entity_rule %{notSpace:request.entity.class} %{notSpace:request.entity.media_type} %{data:request.entity.json:json}
Result:
{
  "request": {
    "entity": {
      "class": "HttpEntity.Strict",
      "media_type": "application/json",
      "json": {
        "duration": 451,
        "extract": "text",
        "type": "text",
        "field2": "text2"
      }
    }
  }
}
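For checking the expected shape locally, the same three-way split (class, media type, JSON body) can be sketched in Python. This is an illustration of the idea behind entity_rule, not how Datadog evaluates the rule; parse_entity and the wanted dictionary are hypothetical names introduced here.

```python
import json

# The request.entity value produced by the first parser.
entity = (
    'HttpEntity.Strict application/json '
    '{"type":"text","extract": "text", "field2":"text2","duration": 451 }'
)


def parse_entity(value):
    """Split off the first two space-separated tokens; the rest is the JSON body."""
    entity_class, media_type, body = value.split(" ", 2)
    return {
        "class": entity_class,
        "media_type": media_type,
        "json": json.loads(body),
    }


parsed = parse_entity(entity)

# Keep only the two fields the question asks for.
wanted = {key: parsed["json"][key] for key in ("extract", "duration")}
print(wanted)
```

Note that duration comes out as the number 451, matching the Grok result above, rather than the string "451" shown in the question.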

Now when we look at the final parsed log, it has everything we need broken out.


Also, just because it was really easy, I added a third Grok processor for the headers block (with the advanced settings set to parse from request.headers):

Sample input: Timeout-Access: <function1>, Remote-Address: 192.168.0.1:37936, Host: app:5000, Connection: close, X-Real-Ip: 192.168.1.1, X-Forwarded-For: 192.168.1.1, Authorization: ***, Accept: application/json, text/plain, */*, Referer: https://google.com, Accept-Language: cs-CZ, Accept-Encoding: gzip, deflate, User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko, Cache-Control: no-cache
Grok parser rule: headers_rule %{data:request.headers:keyvalue(": ", "/)(; :")}
Result:
{
  "request": {
    "headers": {
      "Timeout-Access": "function1",
      "Remote-Address": "192.168.0.1:37936",
      "Host": "app:5000",
      "Connection": "close",
      "X-Real-Ip": "192.168.1.1",
      "X-Forwarded-For": "192.168.1.1",
      "Accept": "application/json",
      "Referer": "https://google.com",
      "Accept-Language": "cs-CZ",
      "Accept-Encoding": "gzip",
      "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko",
      "Cache-Control": "no-cache"
    }
  }
}

The only tricky bit here was that I had to define a characterWhiteList of /)(; : , mainly to handle all of the special characters in the User-Agent field.
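If you want to sanity-check the header splitting offline, a rough Python sketch follows. It uses a different technique from the keyvalue filter: it splits on ", " only where the next token looks like a header name followed by ": ". As a result it keeps the full comma-separated values (for example all of Accept), unlike the Grok result above. The shortened header string, HEADER_BOUNDARY, and parse_headers are assumptions for illustration only.

```python
import re

# Shortened sample (assumption): a subset of the request.headers value.
headers = (
    "Timeout-Access: <function1>, Remote-Address: 192.168.0.1:37936, Host: app:5000, "
    "Accept: application/json, text/plain, */*, Accept-Encoding: gzip, deflate, "
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko, "
    "Cache-Control: no-cache"
)

# Split on ", " only when what follows looks like the next "Header-Name: ".
HEADER_BOUNDARY = re.compile(r", (?=[A-Za-z][A-Za-z0-9-]*: )")


def parse_headers(text):
    """Return a dict of header name -> raw value."""
    result = {}
    for item in HEADER_BOUNDARY.split(text):
        # Partition on the first ": " so colons inside values (ports, rv:11.0) survive.
        name, _, value = item.partition(": ")
        result[name] = value
    return result


parsed_headers = parse_headers(headers)
print(parsed_headers["User-Agent"])
```

The lookahead is what keeps values such as "gzip, deflate" together, since "deflate" is not followed by ": ".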

References:

Just the docs and some guess-and-check in my personal Datadog account.

https://docs.datadoghq.com/logs/processing/parsing/?tab=matcher#key-value-or-logfmt

Regarding json - Datadog Grok parsing - extracting fields from nested JSON, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62092243/
