
Chapter 3 Application Monitoring

About application monitoring

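Prometheus collects all of its metrics over HTTP from a metrics endpoint, so an application only needs to expose such an endpoint (by convention at /metrics) and Prometheus will scrape it at a regular interval.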

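With the rise of containers and Kubernetes, many services now ship with a built-in metrics endpoint. For applications that do not provide one, the Prometheus project maintains or lists many ready-to-use exporters that collect the metrics on their behalf, such as redis_exporter and mysqld_exporter.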

Monitoring applications that have a built-in /metrics endpoint

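CoreDNS in Kubernetes ships with a built-in metrics endpoint, so it makes a good first target. The CoreDNS configuration shows that the port exposed for Prometheus scraping is 9153.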

[root@node1 prom]# kubectl -n kube-system describe cm coredns
Name: coredns
Namespace: kube-system
Labels: <none>
Annotations: <none>

Data
====
Corefile:
----
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153    # built-in endpoint that exposes metrics to Prometheus
    forward . /etc/resolv.conf {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}

Events: <none>

Look up the CoreDNS Pod addresses

[root@node1 prom]# kubectl -n kube-system get pod -l k8s-app=kube-dns -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE   READINESS GATES
coredns-6d56c8448f-ckwhg   1/1     Running   5          10d   10.2.0.17   node1   <none>           <none>
coredns-6d56c8448f-rvmdf   1/1     Running   5          10d   10.2.0.16   node1   <none>           <none>

Access the /metrics endpoint directly

[root@node1 prom]# curl -I 10.2.0.17:9153/metrics
HTTP/1.1 200 OK
Content-Type: text/plain; version=0.0.4; charset=utf-8
Date: Wed, 11 Aug 2021 07:14:25 GMT

[root@node1 prom]# curl 10.2.0.17:9153/metrics

# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.14.4",revision="f59c03d",version="1.7.0"} 1
# HELP coredns_cache_entries The number of elements in the cache.
# TYPE coredns_cache_entries gauge
coredns_cache_entries{server="dns://:53",type="denial"} 12
coredns_cache_entries{server="dns://:53",type="success"} 1
# HELP coredns_cache_misses_total The count of cache misses.
# TYPE coredns_cache_misses_total counter
coredns_cache_misses_total{server="dns://:53"} 13
# HELP coredns_dns_request_duration_seconds Histogram of the time (in seconds) each request took.
# TYPE coredns_dns_request_duration_seconds histogram
...........................................
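
The endpoint returns plain text, one sample per line in the form name{labels} value, with "# HELP" and "# TYPE" comment lines in between. As a rough sketch of how to pull a single value out of that text, the helper below filters by metric name. The name metric_value is made up for this example, and it assumes samples carry no trailing timestamp and label values contain no spaces.

```shell
#!/usr/bin/env bash
# metric_value NAME: read Prometheus text-format metrics on stdin and print
# the value of every sample whose metric name is exactly NAME.
# (metric_value is a hypothetical helper, not part of any tool.)
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                     # skip the HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)        # strip the {label="..."} part
      if (metric == name) print $NF   # last field is the sample value
    }'
}

# Sample lines copied from the CoreDNS output above.
sample='# TYPE coredns_cache_entries gauge
coredns_cache_entries{server="dns://:53",type="denial"} 12
coredns_cache_entries{server="dns://:53",type="success"} 1
# TYPE coredns_cache_misses_total counter
coredns_cache_misses_total{server="dns://:53"} 13'

printf '%s\n' "$sample" | metric_value coredns_cache_misses_total   # prints 13
```

Against the live endpoint the same helper could be fed with: curl -s 10.2.0.17:9153/metrics | metric_value coredns_cache_misses_total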

Now that we know the port and have confirmed that it is reachable, we can edit the Prometheus configuration file so that it scrapes this service.

Edit the prom-cm configuration file

[root@node1 prom]# cat prom-cm.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prom
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']

    - job_name: 'coredns'    # job name
      static_configs:        # static target list
      - targets: ['10.2.0.16:9153','10.2.0.17:9153']    # the CoreDNS Pod IPs found above (Pod IPs, not a ClusterIP)
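
The addresses in the coredns targets line are the Pod IPs from the earlier kubectl get pod -o wide output. Pod IPs change whenever a Pod is recreated, so a static list like this has to be regenerated afterwards. As a small sketch, the helper below turns a list of IPs into the targets value; targets_list is a hypothetical name used only for this example.

```shell
#!/usr/bin/env bash
# targets_list PORT: read one IP per line on stdin and print a YAML flow list
# suitable for the "targets:" field of a static_configs entry.
# (targets_list is a hypothetical helper, not part of kubectl or Prometheus.)
targets_list() {
  awk -v port="$1" -v q="'" '
    NF {
      sep = (out == "" ? "" : ",")       # comma before every entry but the first
      out = out sep q $1 ":" port q
    }
    END { print "[" out "]" }'
}

# Demo with the two CoreDNS Pod IPs shown earlier.
printf '10.2.0.16\n10.2.0.17\n' | targets_list 9153
# prints ['10.2.0.16:9153','10.2.0.17:9153']

# Against a live cluster the IPs could come from kubectl instead:
# kubectl -n kube-system get pod -l k8s-app=kube-dns \
#   -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' | targets_list 9153
```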

Apply the updated prom-cm resource

[root@node1 prom]# kubectl apply -f prom-cm.yml

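Because hot reloading was enabled when we deployed Prometheus (the /-/reload endpoint is only served when Prometheus runs with the --web.enable-lifecycle flag), the new configuration can take effect online without restarting the Pod.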

Hot-reload the Prometheus configuration

[root@node1 prom]# kubectl -n prom get pod -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE   READINESS GATES
prometheus-796566c67c-lhrns   1/1     Running   0          22m   10.2.2.86   node2   <none>           <none>

# Note: wait a moment first, because the ConfigMap update takes some time to propagate into the Pod
[root@node1 prom]# curl -X POST "http://10.2.2.86:9090/-/reload"
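
The plain curl call above prints nothing on success, so a failed reload is easy to miss. A minimal sketch that checks the HTTP status instead is shown below; reload_prometheus is a hypothetical wrapper name, and it assumes that a successful reload answers with HTTP 200.

```shell
#!/usr/bin/env bash
# reload_prometheus BASE_URL: POST to the reload endpoint and report the result.
# (reload_prometheus is a hypothetical wrapper around curl.)
reload_prometheus() {
  local code
  # -o /dev/null drops the body, -w prints only the HTTP status code;
  # fall back to 000 when curl cannot connect at all.
  code=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$1/-/reload") || code=000
  if [ "$code" = "200" ]; then
    echo "reload ok"
  else
    echo "reload failed: HTTP $code" >&2
    return 1
  fi
}

# Usage, with the Pod IP from the output above:
# reload_prometheus http://10.2.2.86:9090
```

A non-200 answer usually means either that the lifecycle endpoint is not enabled or that the new configuration did not parse, in which case the Prometheus Pod log is the place to look.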

Check what Prometheus has picked up

[Screenshot: Prometheus web UI]

Monitoring with an exporter

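As mentioned above, some applications ship with their own metrics endpoint. For those that do not, we can use an exporter. A large number of exporters are already available, and the official site lists them at the following address: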

https://prometheus.io/docs/instrumenting/exporters/

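The following uses the MySQL exporter as an example. The approach is to run an exporter container next to MySQL inside each MySQL Pod (a sidecar) that collects the MySQL metrics.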

cat >mysql-prom.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-dp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"
      - name: mysql-exporter
        image: prom/mysqld-exporter
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9104
        env:
        - name: DATA_SOURCE_NAME
          value: "root:123456@(localhost:3306)/"
---
kind: Service
apiVersion: v1
metadata:
  name: mysql-svc
spec:
  selector:
    app: mysql
  ports:
  - name: mysql
    port: 3306
    targetPort: 3306
  - name: mysql-prom
    port: 9104
    targetPort: 9104
EOF

Apply it and check the result

[root@node1 prom]# kubectl apply -f mysql-prom.yaml
deployment.apps/mysql-dp created
service/mysql-svc created

[root@node1 prom]# kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
mysql-dp-79b48cff96-m96bz   2/2     Running   0          73s

[root@node1 prom]# kubectl get svc
NAME        TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE
mysql-svc   ClusterIP   10.1.213.31   <none>        3306/TCP,9104/TCP   4m9s
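
Before pointing Prometheus at the exporter, it can help to confirm that the exporter is actually able to talk to MySQL. mysqld_exporter exposes a mysql_up gauge that is 1 when its connection to the database works. The sketch below checks that gauge in the metrics text; check_mysql_up is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_mysql_up: read exporter metrics on stdin and succeed (exit 0) only
# when a "mysql_up 1" sample is present.
# (check_mysql_up is a hypothetical helper, not part of mysqld_exporter.)
check_mysql_up() {
  awk '
    $1 == "mysql_up" { found = 1; up = ($2 == 1) }
    END { exit !(found && up) }'
}

# Demo with a hand-written sample line.
printf 'mysql_up 1\n' | check_mysql_up && echo "exporter can reach MySQL"

# Against the Service created above (ClusterIP taken from the kubectl output):
# curl -s http://10.1.213.31:9104/metrics | check_mysql_up
```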

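Modify the Prometheus configuration file. Note: because Prometheus and MySQL run in different namespaces, the scrape address must be written as the Service name plus its namespace (here mysql-svc.default).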

cat > prom-cm.yml << 'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prom
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']

    - job_name: 'coredns'
      static_configs:
      - targets: ['10.2.0.16:9153','10.2.0.17:9153']

    - job_name: 'mysql'
      static_configs:
      - targets: ['mysql-svc.default:9104']
EOF
EOF
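
The target mysql-svc.default is the short form of the Service's DNS name. Inside the cluster the fully qualified form is service.namespace.svc.cluster-domain, where the cluster domain here is cluster.local (the zone that appears in the CoreDNS Corefile earlier). A small sketch that builds the full scrape address follows; svc_fqdn is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# svc_fqdn SERVICE NAMESPACE PORT [CLUSTER_DOMAIN]: print the fully qualified
# in-cluster address of a Service port. CLUSTER_DOMAIN defaults to cluster.local.
# (svc_fqdn is a hypothetical helper used only for illustration.)
svc_fqdn() {
  local svc=$1 ns=$2 port=$3 domain=${4:-cluster.local}
  echo "${svc}.${ns}.svc.${domain}:${port}"
}

svc_fqdn mysql-svc default 9104
# prints mysql-svc.default.svc.cluster.local:9104
```

Either form should resolve from the Prometheus Pod; the short form relies on the DNS search path that Kubernetes configures in each Pod.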

Apply the updated configuration

[root@node1 prom]# kubectl apply -f prom-cm.yml
configmap/prometheus-config configured

[root@node1 prom]# curl -X POST "http://10.2.2.86:9090/-/reload"

Check the result in Prometheus
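
Besides the web UI, the result can also be checked from the command line through the Prometheus HTTP query API. Every scrape job gets an automatic up series whose value is 1 when the last scrape succeeded and 0 when it failed. The sketch below queries that series for one job; query_up is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# query_up BASE_URL JOB: ask Prometheus for the "up" series of one scrape job.
# The response is JSON, with one result entry per target of that job.
# (query_up is a hypothetical wrapper around curl.)
query_up() {
  # -G sends the data as a URL query string; --data-urlencode escapes the
  # braces and quotes in the PromQL expression.
  curl -s -G "$1/api/v1/query" --data-urlencode "query=up{job=\"$2\"}"
}

# Usage, with the Prometheus Pod IP used above:
# query_up http://10.2.2.86:9090 mysql
```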

Updated: 2024-09-21 15:52:29