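Problem description:
A Kubernetes cluster was deployed from binaries using ansible. After the deployment finished, kubectl get node
showed the nodes in the NotReady state.
kubectl get pod -n kube-system
showed the calico pods stuck in the Pending state.
Checking the pod logs: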
kubectl logs -f calico-node-7fw9h -n kube-system
The command failed with:
Authorization error (user=kubernetes, verb=get, resource=nodes, subresource=proxy)
systemctl status kubelet -l
It reported the following errors:
Feb 11 09:56:48 kubernetesM02 kubelet[20889]: E0211 09:56:48.153429 20889 kubelet_node_status.go:93] "Unable to register node with API server" err="Unauthorized" node="kubernetesm02"
Feb 11 09:56:50 kubernetesM02 kubelet[20889]: E0211 09:56:50.097095 20889 reflector.go:138] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:45: Failed to watch *v1.Pod: failed to list *v1.Pod: Unauthorized
This pointed to a problem on the kube-apiserver side.
systemctl status kube-apiserver -l
It reported the following error:
Feb 11 14:34:11 kubernetesM02 kube-apiserver[16692]: E0211 14:34:11.507411 16692 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\"), verifying certificate SN=223578824471382834555747287975620980748, SKID=, AKID=F4:9A:82:54:A9:67:BB:64:A7:F6:E9:33:A4:A0:C2:4B:6A:A8:D0:1B failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")]"
After searching online (Baidu), I suspected a certificate problem.
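One way to test the certificate hypothesis is to check whether the kubelet certificate actually chains to the CA that kube-apiserver trusts. The following is only a minimal sketch and was not part of the original troubleshooting: cert_signed_by_ca is a hypothetical helper, the file paths in the usage comment are assumptions that depend on how the cluster was deployed, and it assumes an OpenSSL build whose verify subcommand exits non-zero when verification fails.

```shell
# cert_signed_by_ca CA_FILE CERT_FILE   (hypothetical helper)
# Returns 0 if `openssl verify` accepts CERT_FILE with CA_FILE as a trusted CA,
# non-zero otherwise (including when a file is missing or openssl is absent).
cert_signed_by_ca() {
    ca_file="$1"
    cert_file="$2"
    openssl verify -CAfile "$ca_file" "$cert_file" >/dev/null 2>&1
}

# Example usage (both paths are assumptions, adjust to your deployment):
#   if cert_signed_by_ca /etc/kubernetes/ssl/ca.pem /etc/kubernetes/ssl/kubelet.pem; then
#       echo "kubelet certificate is signed by the cluster CA"
#   else
#       echo "kubelet certificate is NOT signed by the cluster CA"
#   fi
```

A non-zero result would be consistent with the "certificate signed by unknown authority" message in the kube-apiserver log.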
Solution:
Delete the kubelet certificates:
rm /etc/kubernetes/ssl/kubelet.*
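Restart kubelet.
If the bootstrap mechanism (TLS bootstrapping) is enabled, kubelet automatically requests a new certificate and has it issued; otherwise the certificate has to be approved manually.
Running kubectl get node again shows that the node status is back to normal.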
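Deleting the files with rm cannot be undone. A slightly more cautious variant moves the kubelet certificate files aside before restarting kubelet. This is a sketch under assumptions: backup_kubelet_certs is a hypothetical helper, the backup directory is an arbitrary choice, and the default source directory is the /etc/kubernetes/ssl path used in this write-up.

```shell
# backup_kubelet_certs [SSL_DIR] [BACKUP_DIR]   (hypothetical helper)
# Moves every file matching SSL_DIR/kubelet.* into BACKUP_DIR instead of
# deleting it, so the old certificates can be inspected or restored later.
backup_kubelet_certs() {
    ssl_dir="${1:-/etc/kubernetes/ssl}"
    backup_dir="${2:-/root/kubelet-cert-backup}"   # assumed location
    mkdir -p "$backup_dir" || return 1
    for f in "$ssl_dir"/kubelet.*; do
        # With no match the glob stays literal; skip it.
        [ -e "$f" ] || continue
        mv "$f" "$backup_dir"/ || return 1
    done
}

# Example usage on the affected node (run as root):
#   backup_kubelet_certs && systemctl restart kubelet
```

Other files in the directory, such as the CA certificate, are left untouched because only names starting with "kubelet." match the pattern.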
To approve the certificate manually:
kubectl get csr # list the certificate signing requests
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-PJit5dDGv1tO3e3ELjBXu5ORF4NtL8K7qkS2AM2KE7g 3m6s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
kubectl certificate approve node-csr-PJit5dDGv1tO3e3ELjBXu5ORF4NtL8K7qkS2AM2KE7g # approve the certificate signing request
certificatesigningrequest.certificates.k8s.io/node-csr-PJit5dDGv1tO3e3ELjBXu5ORF4NtL8K7qkS2AM2KE7g approved
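When several nodes are waiting at once, approving requests one by one gets tedious. The sketch below picks out the requests that are still pending so they can be reviewed and approved in a loop. It assumes the last column of the kubectl get csr table is CONDITION, as in the output above; pending_csr_names is a hypothetical helper, not a kubectl feature.

```shell
# pending_csr_names   (hypothetical helper)
# Reads the table printed by `kubectl get csr` on stdin and prints the NAME
# of every request whose last column (CONDITION) is exactly "Pending".
pending_csr_names() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Example usage (requires cluster admin access; review the names first):
#   kubectl get csr | pending_csr_names | while read -r name; do
#       kubectl certificate approve "$name"
#   done
```

Requests already shown as Approved,Issued are skipped, so rerunning the loop only touches requests that still need approval.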
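Root cause analysis:
The kubelet bootstrap process failed, so the automatic certificate approval between kube-apiserver and kubelet never completed and the two could not establish a connection.
Deleting the kubelet certificates and restarting kubelet made kubelet run the bootstrap again and complete the automatic certificate approval, which resolved the problem.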