Prometheus hot reload
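To make Prometheus pick up configuration changes without stopping it (a hot reload), first edit the Prometheus configuration in prometheus-cfg.yaml, then update the resource manifests with kubectl apply -f prometheus-cfg.yaml and kubectl apply -f prometheus-deploy.yaml. To make the new configuration take effect, send the hot-reload request below. Note that the /-/reload endpoint only works when Prometheus is started with the --web.enable-lifecycle flag.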
curl -X POST http://10.244.1.66:9090/-/reload
10.244.1.66 is the IP address of the Prometheus pod. To find the pod IP, run:
kubectl get pods -n monitor-sa -o wide | grep prometheus
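The output looks like the following. In this session the namespace is monitor and the Prometheus pod IP is 10.233.90.79, so substitute your own namespace and pod IP in the commands above.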
[root@master prometheus]# vim prometheus-cfg.yaml
[root@master prometheus]# kubectl apply -f prometheus-cfg.yaml
configmap/prometheus-config configured
[root@master ~]# kubectl get pod -o wide -n monitor
NAME                                 READY   STATUS    RESTARTS   AGE   IP              NODE     NOMINATED NODE   READINESS GATES
grafana-757fcd5f7c-mgbjg             1/1     Running   9          19d   10.233.96.104   node2    <none>           <none>
node-exporter-bl92s                  1/1     Running   11         21d   192.168.100.7   node2    <none>           <none>
node-exporter-dgzlt                  1/1     Running   11         21d   192.168.100.6   node1    <none>           <none>
node-exporter-lrt46                  1/1     Running   11         21d   192.168.100.5   master   <none>           <none>
prometheus-server-7fb65555b9-bbh4j   1/1     Running   1          10h   10.233.90.79    node1    <none>           <none>
[root@master ~]# curl -X POST http://10.233.90.79:9090/-/reload
You have new mail in /var/spool/mail/root
[root@master ~]# kubectl logs prometheus-server-7fb65555b9-bbh4j -n monitor -f
level=info ts=2021-11-01T13:01:24.926892426Z caller=main.go:588 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2021-11-01T13:01:24.929509257Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2021-11-01T13:01:24.930321253Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2021-11-01T13:01:24.931003091Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2021-11-01T13:01:24.93168627Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
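The lookup and reload steps above can be combined into two small helpers. This is only a sketch: the function names prometheus_pod_ip and reload_prometheus are made up for illustration, and the awk filter assumes the same column layout as the kubectl get pod -o wide output shown above (IP in the sixth column, pod name starting with "prometheus").

```shell
# Read `kubectl get pods -o wide` output on stdin and print the IP column
# of the first pod whose name starts with "prometheus".
# Assumption: IP is the 6th whitespace-separated field, as in the session above.
prometheus_pod_ip() {
  awk '$1 ~ /^prometheus/ { print $6; exit }'
}

# Send the hot-reload request to the Prometheus instance at the given IP.
# -f makes curl return a non-zero status if the server answers with an HTTP error.
reload_prometheus() {
  curl -fsS -X POST "http://$1:9090/-/reload"
}

# Example usage (namespace "monitor" taken from the session above):
#   ip="$(kubectl get pods -n monitor -o wide | prometheus_pod_ip)"
#   reload_prometheus "$ip"
```

If several Prometheus pods are running, only the first match is printed, so each pod would need its own reload request.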
Hot reload can be slow to take effect. A blunter option is to force-restart Prometheus: after modifying prometheus-cfg.yaml as above, delete the resources:
kubectl delete -f prometheus-cfg.yaml
kubectl delete -f prometheus-deploy.yaml
Then recreate them with apply:
kubectl apply -f prometheus-cfg.yaml
kubectl apply -f prometheus-deploy.yaml
Note:
In production, prefer hot reload. Force-deleting the resources may cause monitoring data to be lost.
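If a restart is unavoidable, a gentler variant than delete followed by apply is to let the Deployment controller replace the pods while the ConfigMap and Deployment objects stay in place. The sketch below is an assumption rather than the procedure used above: the helper name restart_prometheus is hypothetical, and the Deployment name prometheus-server is inferred from the pod names in the session, not confirmed by the manifests.

```shell
# Restart the pods of a Deployment and wait for the replacements to become available.
# $1 = namespace, $2 = Deployment name (both must match your cluster).
restart_prometheus() {
  kubectl rollout restart "deployment/$2" -n "$1" &&
    kubectl rollout status "deployment/$2" -n "$1" --timeout=180s
}

# Example usage (names are assumptions):
#   restart_prometheus monitor prometheus-server
```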
Changing the replica count
[root@master prometheus]# kubectl get pod -n monitor
NAME                                 READY   STATUS    RESTARTS   AGE
grafana-757fcd5f7c-mgbjg             1/1     Running   11         20d
node-exporter-bl92s                  1/1     Running   13         22d
node-exporter-dgzlt                  1/1     Running   13         22d
node-exporter-lrt46                  1/1     Running   13         22d
prometheus-server-7fb65555b9-bbh4j   1/1     Running   3          36h
prometheus-server-7fb65555b9-jd62q   1/1     Running   0          26s
[root@master prometheus]# kubectl get ep -n monitor
NAME         ENDPOINTS                              AGE
grafana      10.233.96.123:3000                     20d
prometheus   10.233.90.93:9090,10.233.96.126:9090   36h
Whichever of the two pods you access, you get the same data, which gives Prometheus high availability.
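The command that produced the second replica is not shown above. One way to get there, sketched here under the assumption that the Deployment is named prometheus-server (inferred from the pod names), is kubectl scale; the helper name scale_prometheus is hypothetical. Editing the replicas field in prometheus-deploy.yaml and re-applying it would be the other obvious route, and a later kubectl apply of that file would overwrite a count set with kubectl scale.

```shell
# Set the replica count of a Deployment.
# $1 = namespace, $2 = Deployment name, $3 = desired number of replicas.
scale_prometheus() {
  kubectl scale "deployment/$2" -n "$1" --replicas="$3"
}

# Example usage (names are assumptions):
#   scale_prometheus monitor prometheus-server 2
```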
[root@master prometheus]# kubectl get pvc -n monitor
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
grafana      Bound    pvc-828870bc-2a40-4332-ab5a-9d62eb2c6112   5Gi        RWX            managed-nfs-storage   20d
prometheus   Bound    pvc-13175f30-cad6-4bc3-ad48-3ee6d170db3e   5Gi        RWX            managed-nfs-storage   19d
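After you update the configuration, you can delete one of the two pods. The service stays reachable because, with two replicas, the other pod keeps serving. In this way you can delete the pods one at a time to roll out the new configuration. (Make sure the newly created pod is Ready before deleting the old one.)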
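The one-at-a-time procedure above can be sketched as a small shell function. It assumes the pods are managed by a Deployment (prometheus-server here is inferred from the pod names) and uses kubectl rollout status to wait until the replacement pod is available before the next pod is deleted. The function name rolling_delete and the 180s timeout are illustrative choices, not part of the original setup.

```shell
# Delete the given pods one by one, waiting after each deletion until the
# Deployment reports all replicas available again.
# $1 = namespace, $2 = Deployment name, remaining arguments = pod names.
rolling_delete() {
  rd_ns="$1"
  rd_deploy="$2"
  shift 2
  for rd_pod in "$@"; do
    # Stop at the first failure so a broken rollout never removes the last healthy pod.
    kubectl delete pod "$rd_pod" -n "$rd_ns" || return 1
    kubectl rollout status "deployment/$rd_deploy" -n "$rd_ns" --timeout=180s || return 1
  done
}

# Example usage (pod names taken from the session above):
#   rolling_delete monitor prometheus-server \
#     prometheus-server-7fb65555b9-bbh4j prometheus-server-7fb65555b9-jd62q
```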