Now let's create the Pod resource for Prometheus:
# prometheus-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: kube-mon
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
        - image: prom/prometheus:v2.24.1
          name: prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus" # TSDB data path
            - "--storage.tsdb.retention.time=24h"
            - "--web.enable-admin-api" # enables the admin HTTP API, which includes operations such as deleting time series
            - "--web.enable-lifecycle" # enables hot reload: an HTTP POST to localhost:9090/-/reload takes effect immediately
          ports:
            - containerPort: 9090
              name: http
          volumeMounts:
            - mountPath: "/etc/prometheus"
              name: config-volume
            - mountPath: "/prometheus"
              name: data
          resources:
            requests:
              cpu: 100m
              memory: 512Mi
            limits:
              cpu: 100m
              memory: 512Mi
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-data
        - configMap:
            name: prometheus-config
          name: config-volume
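For the sake of Prometheus's performance and data persistence, we persist the data directly through a LocalPV here, and point Prometheus at the data directory with --storage.tsdb.path=/prometheus. Create the PV and PVC resource objects shown below; note that the PV is a LocalPV with node affinity to node3: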
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-local
  labels:
    app: prometheus
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/prometheus
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node3
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
  namespace: kube-mon
spec:
  selector:
    matchLabels:
      app: prometheus
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: local-storage
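Both objects above reference a storage class named local-storage, which this section does not define. If the cluster does not already have one, a minimal sketch might look like the following. It assumes the usual pattern for statically provisioned local volumes: no dynamic provisioner, and binding delayed until a Pod that uses the claim is scheduled. Only the name local-storage comes from the manifests above; the rest is an assumption about how the cluster is set up.

```yaml
# Hypothetical StorageClass for the LocalPV above; adjust or skip it
# if the cluster already defines "local-storage".
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
# Local volumes are created by hand, so there is no dynamic provisioner.
provisioner: kubernetes.io/no-provisioner
# Delay binding until a Pod uses the claim, so that scheduling can take
# the PV's node affinity (node3) into account.
volumeBindingMode: WaitForFirstConsumer
```

With this binding mode the PVC is expected to stay Pending until the Prometheus Pod is scheduled.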
Now we can create the Prometheus resource objects:
$ kubectl apply -f prometheus-deploy.yaml
deployment.apps/prometheus created
$ kubectl get pods -n kube-mon
NAME READY STATUS RESTARTS AGE
prometheus-df4f47d95-vksmc 0/1 CrashLoopBackOff 3 98s
$ kubectl logs -f prometheus-df4f47d95-vksmc -n kube-mon
level=info ts=2019-12-12T03:08:49.424Z caller=main.go:332 msg="Starting Prometheus" version="(version=2.14.0, branch=HEAD, revision=edeb7a44cbf745f1d8be4ea6f215e79e651bfe19)"
level=info ts=2019-12-12T03:08:49.424Z caller=main.go:333 build_context="(go=go1.13.4, user=root@df2327081015, date=20191111-14:27:12)"
level=info ts=2019-12-12T03:08:49.425Z caller=main.go:334 host_details="(Linux 3.10.0-1062.4.1.el7.x86_64 #1 SMP Fri Oct 18 17:15:30 UTC 2019 x86_64 prometheus-df4f47d95-vksmc (none))"
level=info ts=2019-12-12T03:08:49.425Z caller=main.go:335 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-12-12T03:08:49.425Z caller=main.go:336 vm_limits="(soft=unlimited, hard=unlimited)"
level=error ts=2019-12-12T03:08:49.425Z caller=query_logger.go:85 component=activeQueryTracker msg="Error opening query log file" file=/prometheus/queries.active err="open /prometheus/queries.active: permission denied"
panic: Unable to create mmap-ed active query log

goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x7ffd8cf6ec5d, 0xb, 0x14, 0x2b4f400, 0xc0006f33b0, 0x2b4f400)
	/app/promql/query_logger.go:115 +0x48c
main.main()
	/app/cmd/prometheus/main.go:364 +0x5229
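After the Pod is created, we can see that it did not start successfully and reports the error open /prometheus/queries.active: permission denied. This is because the Prometheus image runs as the nobody user, whereas the host directory mounted through the LocalPV is owned by root: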
$ ls -la /data/k8s
total 36
drwxr-xr-x 6 root root 4096 Dec 12 11:07 .
dr-xr-xr-x. 19 root root 4096 Nov 9 23:19 ..
drwxr-xr-x 2 root root 4096 Dec 12 11:07 prometheus
[root@master persistentvolume]# kubectl exec -it prometheus-server-5775f99578-5gm5f -n monitor sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/prometheus $ id
uid=65534(nobody) gid=65534(nogroup)
/prometheus $ ls -la /prometheus/
total 16
drwxr-xr-x 3 nobody nogroup 4096 Jan 17 15:31 .
drwxr-xr-x 1 root root 4096 Jan 17 15:31 ..
-rw------- 1 nobody nogroup 2 Jan 17 15:31 lock
drwxr-xr-x 2 nobody nogroup 4096 Jan 17 15:31 wal
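So a permission error is to be expected. To fix it, we can use securityContext to deal with the volume permissions for the Pod, for example by setting runAsUser=0 so that the container runs as root. Alternatively, we can add an initContainer that changes the ownership of the data directory: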
......
initContainers:
  - name: fix-permissions
    image: busybox
    command: [chown, -R, "nobody:nobody", /prometheus]
    volumeMounts:
      - name: data
        mountPath: /prometheus
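For comparison, the securityContext option mentioned above could be sketched roughly as follows. This is only an illustrative fragment of the Deployment's Pod template, assuming the setting is applied at the Pod level; it trades the ownership fix for running Prometheus as root, so the initContainer approach is usually the less intrusive of the two.

```yaml
# Fragment of prometheus-deploy.yaml (sketch): run the Pod's containers
# as root so they can write to the root-owned LocalPV directory.
spec:
  template:
    spec:
      securityContext:
        runAsUser: 0 # UID 0 is root
      serviceAccountName: prometheus
      containers:
        - image: prom/prometheus:v2.24.1
          name: prometheus
          # ...args, ports, volumeMounts and resources unchanged...
```

Only one of the two approaches is needed.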
Now let's re-apply the Prometheus Deployment:
$ kubectl apply -f prometheus-deploy.yaml
deployment.apps/prometheus configured
$ kubectl get pods -n kube-mon
NAME READY STATUS RESTARTS AGE
prometheus-79b8774f68-7m8zr 1/1 Running 0 56s
$ kubectl logs -f prometheus-79b8774f68-7m8zr -n kube-mon
level=info ts=2019-12-12T03:17:44.228Z caller=main.go:332 msg="Starting Prometheus" version="(version=2.14.0, branch=HEAD, revision=edeb7a44cbf745f1d8be4ea6f215e79e651bfe19)"
......
level=info ts=2019-12-12T03:17:44.822Z caller=main.go:673 msg="TSDB started"
level=info ts=2019-12-12T03:17:44.822Z caller=main.go:743 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2019-12-12T03:17:44.827Z caller=main.go:771 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2019-12-12T03:17:44.827Z caller=main.go:626 msg="Server is ready to receive web requests."