第6章 Grafana
grafana介绍
安装部署
cat > grafana.yml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana
namespace: prom
spec:
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
volumes:
- name: storage
hostPath:
path: /data/k8s/grafana/
nodeSelector:
kubernetes.io/hostname: node2
securityContext:
runAsUser: 0
containers:
- name: grafana
image: grafana/grafana:7.4.3
imagePullPolicy: IfNotPresent
ports:
- containerPort: 3000
name: grafana
env:
- name: GF_SECURITY_ADMIN_USER
value: admin
- name: GF_SECURITY_ADMIN_PASSWORD
value: admin
readinessProbe:
failureThreshold: 10
httpGet:
path: /api/health
port: 3000
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
livenessProbe:
failureThreshold: 3
httpGet:
path: /api/health
port: 3000
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 3
resources:
limits:
cpu: 150m
memory: 512Mi
requests:
cpu: 150m
memory: 512Mi
volumeMounts:
- mountPath: /var/lib/grafana
name: storage
---
apiVersion: v1
kind: Service
metadata:
name: grafana
namespace: prom
spec:
ports:
- port: 3000
selector:
app: grafana
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grafana
namespace: prom
labels:
app: grafana
spec:
ingressClassName: nginx
rules:
- host: grafana.k8s.com
http:
paths:
- path: /
pathType: ImplementationSpecific
backend:
service:
name: grafana
port:
number: 3000
EOF
应用资源配置:
[root@node1 prom]# kubectl apply -f grafana.yml
deployment.apps/grafana created
service/grafana created
ingress.networking.k8s.io/prometheus configured
访问测试:
添加数据源
安装插件
grafana具有丰富的插件,这里我们使用一个非常强大的专门对k8s集群进行监控的插件 :
DevOpsProdigy KubeGraf 项目地址为:
https://github.com/devopsprodigy/kubegraf/
https://github.com/devopsprodigy/kubegraf-v2
安装这个插件需要我们进入grafana的pod内进行安装:
[root@node1 prom]# kubectl -n prom exec -it grafana-7f5b7455fc-z6ctx -- /bin/bash
bash-5.0# grafana-cli plugins install devopsprodigy-kubegraf-app
installing devopsprodigy-kubegraf-app @ 1.5.2
from: https://grafana.com/api/plugins/devopsprodigy-kubegraf-app/versions/1.5.2/download
into: /var/lib/grafana/plugins
✔ Installed devopsprodigy-kubegraf-app successfully
installing grafana-piechart-panel @ 1.6.2
from: https://grafana.com/api/plugins/grafana-piechart-panel/versions/1.6.2/download
into: /var/lib/grafana/plugins
✔ Installed grafana-piechart-panel successfully
Installed dependency: grafana-piechart-panel ✔
Restart grafana after installing plugins . <service grafana-server restart>
bash-5.0#
安装完成后我们还需要重启一下grafana才能生效,因为我们做了数据持久化,所以直接删除pod重新创建即可。
[root@node1 prom]# kubectl -n prom delete pod grafana-7f5b7455fc-z6ctx
pod "grafana-7f5b7455fc-z6ctx" deleted
重启之后我们在grafana页面激活插件
这里需要对验证,我们使用kubectl的kubeconfig配置文件的内容来进行配置:
[root@node1 prom]# cat ~/.kube/config
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: #CA Cert的值
..............
kind: Config
preferences: {}
users:
- name: kubernetes-admin
user:
client-certificate-data: #Client Cert的值
client-key-data: #Client Key的值
..............
但是配置文件里的为base64编码后的,所以我们还需要进行解码,配置完成后的截图如下:干!你写完整能!!@#!@%#
保存之后左边就会出现插件的图标,点击就可以查看了
导入dashboard
当我们下载别人的dashboard时经常会遇到图形显示错乱或者数据异常,这是因为作者制作的图形的数据源和采集信息和我们部署的prometheus版本不一样或者不匹配,我们可以通过修改采集语句的变量来调整。
https://grafana.com/grafana/dashboards/16098-1-node-exporter-for-prometheus-dashboard-cn-0417-job/
比如这个dashboard作者说有一个指标需要单独填写规则
cm:
global:
scrape_interval: 15s
scrape_timeout: 15s
# 新增加规则文件
rule_files:
- 'node_rules.yml'
...
# 新增加以下配置
node_rules.yml: |
groups:
- name: node_usage_record_rules
interval: 1m
rules:
- record: cpu:usage:rate1m
expr: (1 - avg(irate(node_cpu_seconds_total{mode="idle"}[3m])) by (job,instance)) * 100
- record: mem:usage:rate1m
expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100
重新生效后查看prometheus配置:
修改dashboard的图表语句:
quantile_over_time(0.99, cpu:usage:rate1m{origin_prometheus=~"$origin_prometheus",job=~"$job",}[$interval])
quantile_over_time(0.99, mem:usage:rate1m{origin_prometheus=~"$origin_prometheus",job=~"$job"}[$interval])
更新: 2024-09-21 16:14:11