url - Logstash: get URL params into hash

Tags: url filter logstash

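I'm trying to use Logstash and Elasticsearch to monitor my Apache web server activity. So far it works well, but I need more specific information about my request field. My current Logstash configuration is: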

filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  grok { match => { "request" => [ "url", "%{URIPATH:url_path}%{URIPARAM:url_params}?" ]} }
  urldecode{ field => "url_path" }
  mutate { gsub =>  ["url_params","\?","" ] }
  kv {
    field_split => "&"
    source => "url_params"
    prefix => "url_param_"
  }
  date { match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] }
  geoip { source => "clientip" }
  useragent { source => "agent" }
}

Given a basic Apache log line:

255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"

The result with this first configuration is:

{
         "message" => "255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
        "@version" => "1",
      "@timestamp" => "2013-12-11T08:01:45.000Z",
            ...
         "request" => "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345",
        "url_path" => "/xampp/boreal:123456/status.php",
      "url_params" => "pretty=true&test=boreal%3A12345",
"url_param_pretty" => "true",
  "url_param_test" => "boreal%3A12345",
           ...    
}

And (in a dream world) I would like the URL parameters to come out like this:

{
         ...
         "request" => "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345",
        "url_path" => "/xampp/boreal:123456/status.php",
      "url_params" => {
                "pretty" => "true",
                  "test" => "boreal:12345"
      },
           ...    
}

What I want

  • url_params becomes a hash.
  • Each key of this hash is the name of a parameter.
  • Each corresponding value is the URL-decoded value.

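Questions

  • Do I need to write my own plugin (I'm not familiar with Ruby yet)?
  • Is there an existing plugin (I didn't find one... maybe I searched badly)?
  • Is there a way to do this without a plugin?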

Thanks for your help (and sorry for my English).

Renaud

Solution:

Thanks to Val, who found the solution. I changed my configuration to:

grok { match => { "request" => [ "url", "%{URIPATH:url_path}%{URIPARAM:url_params}?" ]} }
urldecode{ field => "url_path" }
mutate { gsub =>  ["url_params","\?","" ] }
kv {
  field_split => "&"
  source => "url_params"
  target => "url_params_hash"
}
urldecode{ field => "url_params_hash" }

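With this solution, the split is correct even when the url_params string contains an encoded "&" (%26) character, because urldecode runs only after kv has already split the string on "&".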
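
To see why the order matters, here is a minimal Python sketch of the two orderings. It is only a rough model of "split then decode" versus "decode then split", not a reimplementation of the kv or urldecode filters; the function names and the sample query string (with a hypothetical parameter q whose value contains %26) are assumptions made up for illustration.

```python
from urllib.parse import unquote


def split_then_decode(query: str) -> dict:
    """Split on '&' first, then URL-decode each key and value."""
    result = {}
    for pair in query.lstrip("?").split("&"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        result[unquote(key)] = unquote(value)
    return result


def decode_then_split(query: str) -> dict:
    """URL-decode the whole string first, then split on '&'."""
    result = {}
    for pair in unquote(query.lstrip("?")).split("&"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        result[key] = value
    return result


# Hypothetical query string: the value of "q" contains an encoded '&' (%26).
query = "?pretty=true&q=a%26b&test=boreal%3A12345"

print(split_then_decode(query))
print(decode_then_split(query))
```

With the first ordering, q keeps its full value "a&b". With the second, the decoded "&" is treated as a separator, so q is cut down to "a" and a stray key "b" appears.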

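Best answer

You have almost got it right with the kv filter. You just need to change its configuration slightly.

You also need to add another urldecode filter for url_params, right after the one for the path: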

urldecode{ field => "url_path" }
urldecode{ field => "url_params" }
mutate { gsub =>  ["url_params","\?","" ] }
kv {
  field_split => "&"
  source => "url_params"
  target => "url_params_hash"
}

You will get a result like this:

{
        "message" => "255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
       "@version" => "1",
     "@timestamp" => "2013-12-11T08:01:45.000Z",
"url_params_hash" => {
         "pretty" => "true",
           "test" => "boreal:12345"
     }
}

Regarding "Logstash: get URL params into hash", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39898841/
