问题:我在 Kubernetes 集群之外有一个 Prometheus。所以,我想从远程集群导出指标。
我从 Prometheus Github repo 中获取了配置示例并稍微修改了一下。所以,这是我的工作配置。
- job_name: 'kubernetes-apiservers'
scheme: http
kubernetes_sd_configs:
- role: endpoints
api_server: http://cluster-manager.dev.example.net:8080
bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
tls_config:
insecure_skip_verify: true
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;http
- job_name: 'kubernetes-nodes'
scheme: http
kubernetes_sd_configs:
- role: node
api_server: http://cluster-manager.dev.example.net:8080
bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
tls_config:
insecure_skip_verify: true
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- job_name: 'kubernetes-service-endpoints'
scheme: http
kubernetes_sd_configs:
- role: endpoints
api_server: http://cluster-manager.dev.example.net:8080
bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
tls_config:
insecure_skip_verify: true
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (http?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: (.+)(?::\d+);(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name
- job_name: 'kubernetes-services'
scheme: http
metrics_path: /probe
params:
module: [http_2xx]
kubernetes_sd_configs:
- role: service
api_server: http://cluster-manager.dev.example.net:8080
bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
tls_config:
insecure_skip_verify: true
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_service_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: kubernetes_name
- job_name: 'kubernetes-pods'
scheme: http
kubernetes_sd_configs:
- role: pod
api_server: http://cluster-manager.dev.example.net:8080
bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
tls_config:
insecure_skip_verify: true
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: (.+):(?:\d+);(\d+)
replacement: ${1}:${2}
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: kubernetes_pod_name
我不使用 TLS 连接到 API,所以我想禁用它。
当我 curl 时
/metrics
来自 Prometheus 主机的 URL - 它会打印它们。最后我连接到集群,但是......作业没有启动,因此 Prometheus 不会公开重新标记的指标。
我在控制台中看到的。
目标状态:
我还检查了 Prometheus 调试。人们认为系统获得了任何必要的信息并且请求已成功完成。
time="2017-01-25T06:58:04Z" level=debug msg="pod update" kubernetes_sd=pod source="pod.go:66" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{\"__meta_kubernetes_pod_container_port_protocol\":\"UDP\", \"__address__\":\"10.32.0.2:10053\", \"__meta_kubernetes_pod_container_name\":\"kube-dns\", \"__meta_kubernetes_pod_container_port_number\":\"10053\", \"__meta_kubernetes_pod_container_port_name\":\"dns-local\"}, model.LabelSet{\"__address__\":\"10.32.0.2:10053\", \"__meta_kubernetes_pod_container_name\":\"kube-dns\", \"__meta_kubernetes_pod_container_port_number\":\"10053\", \"__meta_kubernetes_pod_container_port_name\":\"dns-tcp-local\", \"__meta_kubernetes_pod_container_port_protocol\":\"TCP\"}, model.LabelSet{\"__meta_kubernetes_pod_container_name\":\"kube-dns\", \"__meta_kubernetes_pod_container_port_number\":\"10055\", \"__meta_kubernetes_pod_container_port_name\":\"metrics\", \"__meta_kubernetes_pod_container_port_protocol\":\"TCP\", \"__address__\":\"10.32.0.2:10055\"}, model.LabelSet{\"__address__\":\"10.32.0.2:53\", \"__meta_kubernetes_pod_container_name\":\"dnsmasq\", \"__meta_kubernetes_pod_container_port_number\":\"53\", \"__meta_kubernetes_pod_container_port_name\":\"dns\", \"__meta_kubernetes_pod_container_port_protocol\":\"UDP\"}, model.LabelSet{\"__address__\":\"10.32.0.2:53\", \"__meta_kubernetes_pod_container_name\":\"dnsmasq\", \"__meta_kubernetes_pod_container_port_number\":\"53\", \"__meta_kubernetes_pod_container_port_name\":\"dns-tcp\", \"__meta_kubernetes_pod_container_port_protocol\":\"TCP\"}, model.LabelSet{\"__meta_kubernetes_pod_container_port_number\":\"10054\", \"__meta_kubernetes_pod_container_port_name\":\"metrics\", \"__meta_kubernetes_pod_container_port_protocol\":\"TCP\", \"__address__\":\"10.32.0.2:10054\", \"__meta_kubernetes_pod_container_name\":\"dnsmasq-metrics\"}, model.LabelSet{\"__meta_kubernetes_pod_container_port_protocol\":\"TCP\", \"__address__\":\"10.32.0.2:8080\", \"__meta_kubernetes_pod_container_name\":\"healthz\", \"__meta_kubernetes_pod_container_port_number\":\"8080\", \"__meta_kubernetes_pod_container_port_name\":\"\"}}, Labels:model.LabelSet{\"__meta_kubernetes_pod_ready\":\"true\", \"__meta_kubernetes_pod_annotation_kubernetes_io_created_by\":\"{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"ReplicaSet\\\",\\\"namespace\\\":\\\"kube-system\\\",\\\"name\\\":\\\"kube-dns-2924299975\\\",\\\"uid\\\":\\\"fa808d95-d7d9-11e6-9ac9-02dfdae1a1e9\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"89\\\"}}\\n\", \"__meta_kubernetes_pod_annotation_scheduler_alpha_kubernetes_io_affinity\":\"{\\\"nodeAffinity\\\":{\\\"requiredDuringSchedulingIgnoredDuringExecution\\\":{\\\"nodeSelectorTerms\\\":[{\\\"matchExpressions\\\":[{\\\"key\\\":\\\"beta.kubernetes.io/arch\\\",\\\"operator\\\":\\\"In\\\",\\\"values\\\":[\\\"amd64\\\"]}]}]}}}\", \"__meta_kubernetes_pod_name\":\"kube-dns-2924299975-dksg5\", \"__meta_kubernetes_pod_ip\":\"10.32.0.2\", \"__meta_kubernetes_pod_label_k8s_app\":\"kube-dns\", \"__meta_kubernetes_pod_label_pod_template_hash\":\"2924299975\", \"__meta_kubernetes_pod_label_tier\":\"node\", \"__meta_kubernetes_pod_annotation_scheduler_alpha_kubernetes_io_tolerations\":\"[{\\\"key\\\":\\\"dedicated\\\",\\\"value\\\":\\\"master\\\",\\\"effect\\\":\\\"NoSchedule\\\"}]\", \"__meta_kubernetes_namespace\":\"kube-system\", \"__meta_kubernetes_pod_node_name\":\"cluster-manager.dev.example.net\", \"__meta_kubernetes_pod_label_component\":\"kube-dns\", \"__meta_kubernetes_pod_label_kubernetes_io_cluster_service\":\"true\", \"__meta_kubernetes_pod_host_ip\":\"54.194.166.39\", \"__meta_kubernetes_pod_label_name\":\"kube-dns\"}, Source:\"pod/kube-system/kube-dns-2924299975-dksg5\"}"
time="2017-01-25T06:58:04Z" level=debug msg="pod update" kubernetes_sd=pod source="pod.go:66" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{\"__address__\":\"10.43.0.0\", \"__meta_kubernetes_pod_container_name\":\"bot\"}}, Labels:model.LabelSet{\"__meta_kubernetes_pod_host_ip\":\"172.17.101.25\", \"__meta_kubernetes_pod_label_app\":\"bot\", \"__meta_kubernetes_namespace\":\"default\", \"__meta_kubernetes_pod_name\":\"bot-272181271-pnzsz\", \"__meta_kubernetes_pod_ip\":\"10.43.0.0\", \"__meta_kubernetes_pod_node_name\":\"ip-172-17-101-25\", \"__meta_kubernetes_pod_annotation_kubernetes_io_created_by\":\"{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"ReplicaSet\\\",\\\"namespace\\\":\\\"default\\\",\\\"name\\\":\\\"bot-272181271\\\",\\\"uid\\\":\\\"c297b3c2-e15d-11e6-a28a-02dfdae1a1e9\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"1465127\\\"}}\\n\", \"__meta_kubernetes_pod_ready\":\"true\", \"__meta_kubernetes_pod_label_pod_template_hash\":\"272181271\", \"__meta_kubernetes_pod_label_version\":\"v0.1\"}, Source:\"pod/default/bot-272181271-pnzsz\"}"
Prometheus 获取更新,但...不会将它们转换为指标。
所以,我已经打破了我的大脑来弄清楚为什么它会这样。所以,如果你能找出可能出错的地方,请帮忙。
最佳答案
如果您想从外部 Prometheus 服务器监控 Kubernetes 集群,我建议设置一个 Prometheus federation拓扑:
这是可扩展的解决方案。您可以根据需要添加监控任意数量的 K8s 集群,直到达到中心 Prometheus 的容量。然后你可以添加另一个中心 Prometheus 实例来监控其他实例。
关于kubernetes - Prometheus:无法从连接的 Kubernetes 集群导出指标,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/41845307/