kubernetes - How to match a PrometheusRule to an AlertmanagerConfig with Prometheus Operator

Tags: kubernetes prometheus prometheus-alertmanager prometheus-operator

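I have multiple PrometheusRules (rule a, rule b), and each rule defines a different expr that triggers its alert. I also have different AlertmanagerConfigs (one with a Slack receiver, another with an Opsgenie receiver). How do I connect a rule to an AlertmanagerConfig? For example: if rule a fires, I want to send a message to Slack; if rule b fires, I want to send a message to Opsgenie.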

This is what I have tried, but it does not work. Am I missing something?

This is the PrometheusRule file:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: service-prometheus
    role: alert-rules
    app: kube-prometheus-stack
    release: monitoring-prom
  name: rule_a
  namespace: monitoring
spec:
  groups:
    - name: rule_a_alert
      rules:
        - alert: usage_exceed
          expr: salesforce_api_usage > 100000
          labels:
            severity: urgent

This is the AlertmanagerConfig:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  labels:
    alertmanagerConfig: slack
  name: slack
  namespace: monitoring
  resourceVersion: "25842935"
  selfLink: /apis/monitoring.coreos.com/v1alpha1/namespaces/monitoring/alertmanagerconfigs/opsgenie-and-slack
  uid: fbb74924-5186-4929-b363-8c056e401921
spec:
  receivers:
  - name: slack-receiver
    slackConfigs:
    - apiURL:
        key: apiURL
        name: slack-config
  route:
    groupBy:
    - job
    groupInterval: 60s
    groupWait: 60s
    receiver: slack-receiver
    repeatInterval: 1m
    routes:
    - matchers:
      - name: job
        value: service_a
      receiver: slack-receiver

Best answer

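You need to match on the alert's labels. In your case, the route tries to match the label job against the value service_a, but the alert produced by your rule carries no such label value. You can either change the matcher in the AlertmanagerConfig so that it matches a label that does exist in the PrometheusRule, such as severity: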

route:
  routes:
  - matchers:
    - name: severity
      value: urgent
    receiver: slack-receiver

Or you can add another label to the PrometheusRule file:

spec:
  groups:
    - name: rule_a_alert
      rules:
        - alert: usage_exceed
          expr: salesforce_api_usage > 100000
          labels:
            severity: urgent
            job: service_a
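
The question also asks about sending rule b to Opsgenie, which the answer above does not spell out. The following is only a sketch of how the same label-matching idea could be applied. The resource names rule-b and opsgenie, the alert name, the threshold, the severity: critical label, and the Secret opsgenie-config with its apiKey key are all hypothetical and not taken from the original post.

```yaml
# Hypothetical second rule: its alert carries a label that only the
# Opsgenie route matches on.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: rule-b                          # hypothetical name
  namespace: monitoring
  labels:
    release: monitoring-prom            # assumed to match the Prometheus ruleSelector, as for rule a
spec:
  groups:
    - name: rule_b_alert
      rules:
        - alert: usage_exceed_critical  # hypothetical alert name
          expr: salesforce_api_usage > 200000   # hypothetical threshold
          labels:
            severity: critical          # hypothetical label value used for routing
---
# Hypothetical AlertmanagerConfig that only accepts alerts with severity=critical.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: opsgenie
  namespace: monitoring
  labels:
    alertmanagerConfig: opsgenie
spec:
  receivers:
  - name: opsgenie-receiver
    opsgenieConfigs:
    - apiKey:
        name: opsgenie-config           # hypothetical Secret in the same namespace
        key: apiKey                     # hypothetical key inside that Secret
  route:
    receiver: opsgenie-receiver
    groupBy:
    - alertname
    matchers:
    - name: severity
      value: critical
```

For Slack to receive only rule a, its top-level route would likewise need a matcher such as severity = urgent, since a top-level route without matchers accepts every alert that reaches it. Two further things may be worth checking if nothing arrives: the Alertmanager resource has to select these objects through its alertmanagerConfigSelector, and by default the operator adds a matcher on the AlertmanagerConfig's own namespace, so the alert may need a namespace label equal to monitoring (behaviour here can vary with operator version and settings).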

Regarding kubernetes - How to match a PrometheusRule to an AlertmanagerConfig with Prometheus Operator, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66663856/
