System environment:

  • Operating system: CentOS 7.6
  • Docker version: 20.10.8
  • Prometheus version: 2.36.0
  • Kubernetes version: 1.20.0
  • BlackBox Exporter version: 0.21.0

BlackBox Exporter

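BlackBox Exporter is the official black-box monitoring solution from the Prometheus project. It lets users probe network endpoints over HTTP, HTTPS, DNS, TCP, and ICMP, and is commonly used to check whether a service is up and running normally.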

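White-box and black-box monitoring

Monitoring discussions often mention two terms, white-box monitoring and black-box monitoring. Both are briefly explained here.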

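Black-box monitoring

Black-box monitoring tests the running state of a service from the user's point of view. Common techniques include HTTP probes, TCP probes, DNS probes, and ICMP. It is typically used to check the availability, connectivity, and response time of sites and services.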

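White-box monitoring

White-box monitoring generally refers to routine monitoring of internal state, such as server resource usage, container status, and middleware stability. These fairly direct metrics describe the infrastructure that keeps business applications running stably. White-box monitoring shows how the system actually behaves internally, and by observing and analyzing the metrics we can anticipate problems a server may run into and correct them in time, before they cause unpredictable losses.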

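Differences between white-box and black-box monitoring

Black-box and white-box monitoring differ considerably. Black-box monitoring is failure-driven: when a monitored service fails, it raises an alert quickly. White-box monitoring is more proactive and predictive, aiming to forecast failures that may occur.
A complete monitoring system needs both working together: white-box monitoring anticipates potential problems, while black-box monitoring quickly detects problems that have already occurred.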

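Deploying BlackBox Exporter on k8s

Create the BlackBox ConfigMap

To make the BlackBox Exporter configuration easy to change, its configuration file is stored in a ConfigMap resource. The ConfigMap manifest, blackbox-exporter-config.yaml, is as follows: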

[root@k8s01 blackbox]# vim blackbox-exporter-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-exporter
  namespace: monitoring
  labels:
    app: blackbox-exporter
data:
  blackbox.yml: |-
    modules:
      ## ----------- DNS probe configuration -----------
      dns_tcp:
        prober: dns
        dns:
          transport_protocol: "tcp"
          preferred_ip_protocol: "ip4"
          query_name: "kubernetes.default.svc.cluster.local" # domain name to query on the probed DNS server
          query_type: "A"
      ## ----------- TCP probe module configuration -----------
      tcp_connect:
        prober: tcp
        timeout: 5s
      ## ----------- ICMP probe configuration -----------
      ping:
        prober: icmp
        timeout: 5s
        icmp:
          preferred_ip_protocol: "ip4"
      ## ----------- HTTP GET 2xx probe module configuration -----------
      http_get_2xx:
        prober: http
        timeout: 10s
        http:
          method: GET
          preferred_ip_protocol: "ip4"
          valid_http_versions: ["HTTP/1.1","HTTP/2.0"]
          valid_status_codes: [200]           # accepted HTTP status codes; defaults to 2xx when unset
          no_follow_redirects: false          # false means redirects are followed
      ## ----------- HTTP GET 3xx probe module configuration -----------
      http_get_3xx:
        prober: http
        timeout: 10s
        http:
          method: GET
          preferred_ip_protocol: "ip4"
          valid_http_versions: ["HTTP/1.1","HTTP/2.0"]
          valid_status_codes: [301,302,304,305,306,307]  # accepted HTTP status codes; defaults to 2xx when unset
          no_follow_redirects: true                      # do not follow redirects, so the 3xx response itself is validated
      ## ----------- HTTP POST probe module -----------
      http_post_2xx:
        prober: http
        timeout: 10s
        http:
          method: POST
          preferred_ip_protocol: "ip4"
          valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
          #headers:                             # HTTP request headers
          #  Content-Type: application/json
          #body: '{}'                           # request body
          
[root@k8s01 blackbox]# kubectl apply -f blackbox-exporter-config.yaml 
configmap/blackbox-exporter created

For reference, see the example configuration file provided in the BlackBox Exporter GitHub repository.

Create the BlackBox Exporter Deployment

[root@k8s01 blackbox]# vim blackbox-exporter-deploy.yaml 
apiVersion: v1
kind: Service
metadata:
  name: blackbox-exporter
  namespace: monitoring
  labels:
    k8s-app: blackbox-exporter
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9115
    targetPort: 9115
  selector:
    k8s-app: blackbox-exporter
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blackbox-exporter
  namespace: monitoring
  labels:
    k8s-app: blackbox-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: blackbox-exporter
  template:
    metadata:
      labels:
        k8s-app: blackbox-exporter
    spec:
      containers:
      - name: blackbox-exporter
        image: prom/blackbox-exporter:v0.21.0
        imagePullPolicy: IfNotPresent
        args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        - --web.listen-address=:9115
        - --log.level=info
        ports:
        - name: http
          containerPort: 9115
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 50Mi
        livenessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
      volumes:
      - name: config
        configMap:
          name: blackbox-exporter
          defaultMode: 420
          
[root@k8s01 blackbox]# kubectl apply -f blackbox-exporter-deploy.yaml 
service/blackbox-exporter created
deployment.apps/blackbox-exporter created          

Check the BlackBox Exporter status

[root@k8s01 blackbox]# kubectl get -n monitoring po|grep blackbox
blackbox-exporter-5bbdb6b595-fhpxq   1/1     Running   0          63s
[root@k8s01 blackbox]# kubectl get -n monitoring svc|grep blackbox
blackbox-exporter   ClusterIP   10.96.0.25    <none>        9115/TCP 
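
Once the Pod is running, a probe can be triggered by hand, because BlackBox Exporter runs a probe whenever its /probe endpoint is requested with a module and a target parameter. The Python sketch below only illustrates the shape of that request. It assumes the in-cluster Service name blackbox-exporter.monitoring:9115 from the manifest above, and the helper names probe_url and run_probe are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Service created above, reachable from inside the cluster.
EXPORTER = "http://blackbox-exporter.monitoring:9115"


def probe_url(module: str, target: str, base: str = EXPORTER) -> str:
    """Build the /probe URL for one module and one target."""
    query = urlencode({"module": module, "target": target})
    return f"{base}/probe?{query}"


def run_probe(module: str, target: str) -> str:
    """Trigger one probe and return the raw metrics text.

    This needs network access to the exporter, for example from a Pod
    inside the cluster.
    """
    with urlopen(probe_url(module, target), timeout=15) as resp:
        return resp.read().decode("utf-8")


print(probe_url("tcp_connect", "kube-dns.kube-system:53"))
```

The module name must be one of the modules defined in the ConfigMap, such as tcp_connect or http_get_2xx.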

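Adding probe configuration to Prometheus

Create the DNS probe configuration
Add a scrape job to Prometheus that uses BlackBox Exporter to probe the health of the specified DNS servers. Prometheus is already deployed on k8s with its configuration stored in a ConfigMap that is mounted into the Pod, so editing that ConfigMap is how the Prometheus configuration is changed.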

[root@k8s01 prometheus]# vim prometheus-config.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval:     15s
      evaluation_interval: 15s
      external_labels:
        cluster: "kubernetes"
        
    scrape_configs:
...
    ################################## Kubernetes BlackBox DNS ###################################
    - job_name: "kubernetes-dns"
      metrics_path: /probe
      params:
        module: [dns_tcp]
      static_configs:
        - targets:
          - kube-dns.kube-system:53
          - 8.8.4.4:53
          - 8.8.8.8:53
          - 223.5.5.5
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: blackbox-exporter.monitoring:9115

Explanation of the parameters above:

################ DNS server monitoring ###################
- job_name: "kubernetes-dns"
  metrics_path: /probe
  params:
    ## Module to use; it must match a module name in the BlackBox Exporter configuration
    ## The DNS module is used here
    module: [dns_tcp]
  static_configs:
    ## Addresses to probe
    - targets:
      - kube-dns.kube-system:53
      - 8.8.4.4:53
      - 8.8.8.8:53
      - 223.5.5.5
  relabel_configs:
    ## Copy each static DNS server address above into the temporary label "__param_target"
    - source_labels: [__address__]
      target_label: __param_target
    ## Use the value of "__param_target" as the instance label
    - source_labels: [__param_target]
      target_label: instance
    ## Address of the BlackBox Exporter Service
    - target_label: __address__
      replacement: blackbox-exporter.monitoring:9115

Pay particular attention to the BlackBox Exporter Service address: change it to match the namespace in which blackbox is deployed.
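
The three relabeling rules of the kubernetes-dns job can be traced step by step. The Python sketch below is only a simplified model of what Prometheus does with the labels of one target, not Prometheus code, and the function name relabel_for_blackbox is made up for this example.

```python
def relabel_for_blackbox(address: str,
                         exporter: str = "blackbox-exporter.monitoring:9115") -> dict:
    """Apply the three relabel rules of the kubernetes-dns job to one target."""
    labels = {"__address__": address}
    # Rule 1: copy the configured target into the "target" URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: show the probed target, not the exporter, as the instance label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: send the actual scrape to the BlackBox Exporter Service.
    labels["__address__"] = exporter
    return labels


print(relabel_for_blackbox("8.8.8.8:53"))
```

After relabeling, Prometheus scrapes blackbox-exporter.monitoring:9115/probe with target=8.8.8.8:53, while the resulting series still carry instance="8.8.8.8:53".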

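Create the Service probe configuration

Create a scrape job that probes Kubernetes Services, checking the health of every Service that carries the annotation prometheus.io/http-probe: "true". Prometheus is already deployed on k8s with its configuration stored in a ConfigMap that is mounted into the Pod, so editing that ConfigMap is how the Prometheus configuration is changed.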

[root@k8s01 prometheus]# vim prometheus-config.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval:     15s
      evaluation_interval: 15s
      external_labels:
        cluster: "kubernetes"
        
    scrape_configs:
...
    ################################## Kubernetes BlackBox Services ###################################
    - job_name: 'kubernetes-services'
      metrics_path: /probe
      params:
        module:
        - "http_get_2xx"
        - "http_get_3xx"
      kubernetes_sd_configs:
      - role: service
      relabel_configs:
      - action: keep
        source_labels: [__meta_kubernetes_service_annotation_prometheus_io_http_probe]
        regex: "true"
      - action: replace
        source_labels: 
        - "__meta_kubernetes_service_name"
        - "__meta_kubernetes_namespace"
        - "__meta_kubernetes_service_annotation_prometheus_io_http_probe_port"
        - "__meta_kubernetes_service_annotation_prometheus_io_http_probe_path"
        target_label: __param_target
        regex: (.+);(.+);(.+);(.+)
        replacement: $1.$2:$3$4
      - target_label: __address__
        replacement: blackbox-exporter.monitoring:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name    

Explanation of the parameters above:

- job_name: "kubernetes-services"
  metrics_path: /probe
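  ## Modules passed to the probe. BlackBox Exporter reads only the first module value of a request, so http_get_2xx is the one that takes effect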
  params: 
    module:
    - "http_get_2xx"
    - "http_get_3xx"
  ## Use Kubernetes service discovery with the service role
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  ## Keep only Services annotated with prometheus.io/http-probe: "true"
  - action: keep
    source_labels: [__meta_kubernetes_service_annotation_prometheus_io_http_probe]
    regex: "true"
  - action: replace
    source_labels: 
    - "__meta_kubernetes_service_name"
    - "__meta_kubernetes_namespace"
    - "__meta_kubernetes_service_annotation_prometheus_io_http_probe_port"
    - "__meta_kubernetes_service_annotation_prometheus_io_http_probe_path"
    target_label: __param_target
    regex: (.+);(.+);(.+);(.+)
    replacement: $1.$2:$3$4
  ## Address of the BlackBox Exporter Service
  - target_label: __address__
    replacement: blackbox-exporter.monitoring:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name

Pay particular attention to the BlackBox Exporter Service address: change it to match the namespace in which blackbox is deployed.
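
The replace rule in the kubernetes-services job joins four source labels with ";" and rewrites them into the probe target. The Python sketch below mimics that step under the assumption that the regex is matched against the whole joined string, as Prometheus relabeling does. The function name build_probe_target is made up for this example.

```python
import re


def build_probe_target(name: str, namespace: str, port: str, path: str):
    """Mimic the replace rule: join the values with ';', match, then substitute."""
    joined = ";".join([name, namespace, port, path])
    # Same pattern as in the job; fullmatch models the anchored regex.
    match = re.fullmatch(r"(.+);(.+);(.+);(.+)", joined)
    if match is None:
        # No match means the rule leaves __param_target unchanged.
        return None
    # Equivalent of the replacement $1.$2:$3$4.
    return "{}.{}:{}{}".format(*match.groups())


print(build_probe_target("nginx", "default", "80", "/"))
```

A Service that lacks the http-probe-port or http-probe-path annotation yields an empty value for that label, the pattern does not match, and no target is set. All three annotations are therefore needed for the probe to work.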

Update the Prometheus ConfigMap

[root@k8s01 prometheus]# kubectl apply -f prometheus-config.yaml 
configmap/prometheus-config configured

Reload the Prometheus configuration

[root@k8s01 prometheus]# curl -XPOST http://10.105.x.x:30089/-/reload
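
The /-/reload endpoint only responds when Prometheus is started with the --web.enable-lifecycle flag. The same reload can be scripted, for example from a deployment job. The Python sketch below is one way to do it. The base URL is a placeholder to be replaced with the real Prometheus address, and the helper names are made up for this example.

```python
from urllib.request import Request, urlopen


def reload_request(base_url: str) -> Request:
    """Build the POST request for the Prometheus reload endpoint."""
    url = f"{base_url.rstrip('/')}/-/reload"
    return Request(url, data=b"", method="POST")


def reload_prometheus(base_url: str) -> int:
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a 4xx or 5xx response, which
    covers the case where the lifecycle API is not enabled.
    """
    with urlopen(reload_request(base_url), timeout=10) as resp:
        return resp.status


# Placeholder address, not a real endpoint.
req = reload_request("http://prometheus.example:9090/")
print(req.get_method(), req.full_url)
```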

Probing a k8s application with Prometheus

This article uses an nginx application deployed on k8s as the test target. The contents of nginx-deploy.yaml are as follows:

[root@k8s01 ~]# vim nginx-deploy.yaml 
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    k8s-app: nginx
  annotations:
    prometheus.io/http-probe: "true"        ### enable HTTP probing for this Service
    prometheus.io/http-probe-port: "80"     ### port to probe
    prometheus.io/http-probe-path: "/"      ### path to probe
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 80
    targetPort: 80
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec: 
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
        
[root@k8s01 ~]# kubectl apply -f nginx-deploy.yaml 
service/nginx created
deployment.apps/nginx created

Check the Prometheus UI

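Open the Prometheus UI and go to Targets under Status. Prometheus is now probing the health of the DNS servers in the configured address list at regular intervals. In addition, it periodically probes the health of the Services in the k8s cluster that carry the specific annotations.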
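
The Targets page shows whether scraping the exporter worked. The outcome of each probe is reported separately in the probe_success metric, where 1 means success and 0 means failure. The Python sketch below shows how that value could be read from the text returned by /probe. The sample text is illustrative rather than captured from a real cluster, and the function name probe_succeeded is made up for this example.

```python
def probe_succeeded(metrics_text: str):
    """Return True or False from probe_success, or None if the metric is absent."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.partition(" ")
        if name == "probe_success":
            return float(value) == 1.0
    return None


# Illustrative sample in the Prometheus text format, not real probe output.
SAMPLE = """\
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
probe_duration_seconds 0.012
"""

print(probe_succeeded(SAMPLE))
```

In Prometheus itself, the same information is available by querying probe_success for a given job or instance.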

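Importing the BlackBox Exporter dashboard template into Grafana

Log in to Grafana, open the menu in the left sidebar, choose Manage, and click the Import button in the upper-right corner. Enter dashboard ID 9965 to import the BlackBox Exporter template and click Load. On the configuration page, select the Prometheus data source, then click Import to open the dashboard.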

Author: 鲜花的主人
Copyright notice: Unless otherwise stated, all articles on this site are licensed under CC BY-NC-SA 4.0. Please credit 爱吃可爱多 when reposting.