openEuler 22.03 LTS SP4: deploying a highly available Kubernetes v1.28.2 cluster with kubeadm
Table of Contents
- Preamble
- When this was written
- Why openEuler
- Why 22.03 LTS SP4
- High-availability architecture
- Aside
- Hands-on
- Environment overview
- System initialization
- Disable the firewall
- Disable SELinux
- Disable swap
- Load kernel modules
- Enable automatic module loading
- Tune sysctl kernel parameters
- Flush iptables rules
- Install dependencies and tools
- Adjust the .bashrc file
- Install kubeadm and kubelet
- Shorten the kubectl command
- Start kubelet
- Install containerd
- Prepare images
- Deploy the master components
- Initialize the cluster
- Install the Calico network plugin
- Join the remaining master nodes
- Deploy nginx
- Deploy keepalived
- Build the keepalived image
- Switch to the HA endpoint
- Update controlPlaneEndpoint
- Update the kubeconfig files
- Restart the master components
- Update the kube-proxy configuration
- Join the worker nodes
- Renew certificates for ten years
- Simulate a node failure
- Class A/B/C address summary
Preamble
When this was written
Beijing time: September 2024
Why openEuler
CentOS 7 reached end of maintenance on June 30, 2024, and with the current push in China for domestically developed software, nobody knows how things will change down the road.
- The main home-grown distributions right now are openEuler (Huawei), Anolis OS (Alibaba), OpenCloudOS (Tencent), UOS (UnionTech), and Kylin OS (the commercial Galaxy Kylin; the open-source edition is openKylin).
- As for why openEuler: so far it is the only one that ships not only ISO images but also WSL and Docker images, and even public-cloud images. Docker Hub is already unreachable directly from inside China, and nobody knows whether even the Docker base images will be interfered with later, so it pays to be prepared.
Why 22.03 LTS SP4
Because 22.03 LTS SP4 was the latest release as of June 2024, and its support lifecycle runs into 2026.
High-availability architecture
- On public cloud, just buy the provider's LB service: simple, effective, and someone else maintains it for you.
- For an on-premises setup like this one, I use keepalived + nginx (a layer-4 stream proxy) to make the apiserver highly available.
In this walkthrough, nginx and keepalived run as containers, mainly so the deployment looks the same regardless of environment differences.
- The HA setup works like this:
- keepalived runs in BACKUP mode on every node
- on whichever node holds the VIP, keepalived health-checks the local nginx, and nginx load-balances that port across the apiservers behind it
- nginx stream (layer 4) is used to keep resource usage down; an http upstream would be layer-7 proxying and comparatively heavier
Aside
I originally wanted to run nginx and keepalived as static Pods, but static Pods cannot reference other API objects (ConfigMaps, Secrets, ServiceAccounts), so I gave up on that; see the official docs: Create static Pods.
- The deployment below is really only suitable for a test environment. In production, don't put the HA components inside the same k8s cluster they protect; run them independently outside the cluster, and consider externalizing etcd as well.
Hands-on
Environment overview
Component | Version |
---|---|
OS | openEuler 22.03 (LTS-SP4) |
containerd | 1.6.33 |
k8s | 1.28.2-0 |
nerdctl | 1.7.6 |
nginx | 1.26.0 |
keepalived | 2.3.1 |
Machine IPs and their roles
IP | HOSTNAME | SERVICE/ROLE |
---|---|---|
192.168.22.111 | manager-k8s-cluster-01 | k8s-master+k8s-worker+keepalived+nginx |
192.168.22.112 | manager-k8s-cluster-02 | k8s-master+k8s-worker+keepalived+nginx |
192.168.22.113 | manager-k8s-cluster-03 | k8s-master+k8s-worker+keepalived+nginx |
192.168.22.114 | manager-k8s-cluster-04 | k8s-worker |
192.168.22.115 | manager-k8s-cluster-05 | k8s-worker |
192.168.22.200 | / | VIP |
System initialization
- If the VMs are not ready yet, you can bring up a single machine, run the initialization on it, and then clone it; that is quicker.
- If the machines are already up, every one of them needs the initialization steps below.
- Configuring a static IP and time synchronization is omitted below; handle those yourself (a minimal sketch follows).
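For reference, a minimal sketch of those two omitted steps, assuming NetworkManager and chrony are available and the interface is named ens3 (adjust the interface, addresses and NTP server to your environment):
# static IP via NetworkManager
nmcli connection modify ens3 ipv4.method manual \
  ipv4.addresses 192.168.22.111/24 ipv4.gateway 192.168.22.1 \
  ipv4.dns 223.5.5.5
nmcli connection up ens3
# time sync via chrony
yum install -y chrony
echo 'server ntp.aliyun.com iburst' >> /etc/chrony.conf
systemctl enable chronyd --now
chronyc sources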
Disable the firewall
systemctl disable firewalld --now
Disable SELinux
setenforce 0
sed -i '/SELINUX/s/enforcing/disabled/g' /etc/selinux/config
Disable swap
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
Load kernel modules
# needed when kube-proxy runs in ipvs mode
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
# required in all cases
modprobe nf_conntrack
modprobe br_netfilter
modprobe overlay
Enable automatic module loading
cat > /etc/modules-load.d/k8s-modules.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
overlay
EOF
Enable the service at boot
systemctl enable systemd-modules-load --now
Tune sysctl kernel parameters
cat <<EOF > /etc/sysctl.d/kubernetes.conf
# enable packet forwarding (needed for vxlan)
net.ipv4.ip_forward=1
# let iptables see bridged traffic
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-arptables=1
# do not reuse TIME-WAIT sockets for new TCP connections
net.ipv4.tcp_tw_reuse=0
# backlog limit for listening sockets
net.core.somaxconn=32768
# max tracked connections, default is nf_conntrack_buckets * 4
net.netfilter.nf_conntrack_max=1000000
# avoid swapping; only swap when the system is close to OOM
vm.swappiness=0
# max number of memory map areas a process may have
vm.max_map_count=655360
# max number of file handles the kernel can allocate
fs.file-max=6553600
# TCP keepalive tuning
net.ipv4.tcp_keepalive_time=600
net.ipv4.tcp_keepalive_intvl=30
net.ipv4.tcp_keepalive_probes=10
EOF
Apply immediately
sysctl -p /etc/sysctl.d/kubernetes.conf
Flush iptables rules
iptables -F && \
iptables -X && \
iptables -F -t nat && \
iptables -X -t nat && \
iptables -P FORWARD ACCEPT
Install dependencies and tools
yum install -y vim wget tar net-tools jq bash-completion tree bind-utils telnet unzip nc
Adjust the .bashrc file
See my earlier post on the openEuler 22.03-LTS-SP4 scp failure for the details. The issue mainly affects the scp command; whether to apply the change is up to you.
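A quick way to tell whether you are affected (a minimal check, not the full fix from that post): a non-interactive remote shell should print nothing, so any output from the command below is exactly what breaks scp, and it usually comes from lines in ~/.bashrc that write to stdout and need to be removed or guarded so they only run in interactive shells.
# replace the address with any other node; silence means scp will work
ssh 192.168.22.112 true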
安装 kubeadm 和 kubelet
k8s 官方也没有 openeuler 的源,但是可以直接使用
kubernetes-el7
的源来安装,下面是配置kubernetes-el7
源
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Installing kubeadm also pulls in kubelet, kubectl and their dependencies automatically
yum install -y kubeadm-1.28.2-0
Verify the version
kubeadm version
Output like the following means the install is fine
kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.2", GitCommit:"89a4ea3e1e4ddd7f7572286090359983e0387b2f", GitTreeState:"clean", BuildDate:"2023-09-13T09:34:32Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
Shorten the kubectl command
Sometimes typing kubectl is just too much effort; a single k will do
ln -s /usr/bin/kubectl /usr/bin/k
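Optionally, give the short k command the same tab completion as kubectl. This relies on the bash-completion package installed earlier; it is a small convenience and nothing later depends on it.
cat >> ~/.bashrc <<'EOF'
source <(kubectl completion bash)
complete -o default -F __start_kubectl k
EOF
source ~/.bashrc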
Start kubelet
Enable it at boot
systemctl enable kubelet --now
Install containerd
openEuler can install containerd from the Docker CE rpm repository for CentOS, which is quite convenient
cat <<EOF> /etc/yum.repos.d/docker.repo
[docker-ce-centos]
name=Docker CE Stable centos
baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/7.9/x86_64/stable
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/docker-ce/linux/centos/gpg
EOF
Install containerd
yum install -y containerd.io-1.6.33
Generate the default configuration file
containerd config default > /etc/containerd/config.toml
Most of the generated config can stay as it is; two settings deserve attention:
- sandbox_image points at the pause image. The default is registry.k8s.io/pause:3.6; you can pre-load that image and keep the tag, or, like me, switch it to the Aliyun mirror.
- SystemdCgroup must be changed to true, because kubelet will be configured to use the systemd cgroup driver later; the generated default is false, and leaving it that way produces errors like the one below
openat2 /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf2248c8a5ab6855d0410a9f38c37b4a0.slice/cpuset.mems: no such file or directory
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9"
SystemdCgroup = true
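If you prefer to make both edits non-interactively, something like the following works against the default config generated above (the paths and values are the ones used in this walkthrough; double-check them if your config.toml differs):
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9"#' /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
grep -E 'sandbox_image|SystemdCgroup' /etc/containerd/config.toml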
Start containerd and enable it at boot
systemctl enable containerd --now
Configure the crictl command. Installing kubeadm already pulled in crictl as a dependency; it just needs a config file pointing it at the containerd socket so it can manage containerd.
crictl reads /etc/crictl.yaml by default; you can also use a custom file and pass it with --config on each invocation.
echo 'runtime-endpoint: unix:///run/containerd/containerd.sock' > /etc/crictl.yaml
Check the crictl and containerd versions
crictl version
Version output like the following means the deployment and startup both worked
Version: 0.1.0
RuntimeName: containerd
RuntimeVersion: 1.6.33
RuntimeApiVersion: v1
Prepare images
kubeadm needs a set of images. In an air-gapped environment, prepare and import them ahead of time; the command below lists which images are required.
- --image-repository matches the repository we will set in the kubeadm config file later; inside China the Aliyun mirror below works well
- --kubernetes-version pins the Kubernetes version
kubeadm config images list \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--kubernetes-version 1.28.2
The expected output looks like this
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1
If the environment has internet access but the connection is flaky, you can also pre-pull the images with the command below so the init phase does not time out
kubeadm config images pull \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--kubernetes-version 1.28.2
The pull prints output like the following; once coredns appears, all images are in place
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1
You can also pre-pull the Calico images
ctr -n k8s.io image pull docker.io/calico/cni:v3.28.1
ctr -n k8s.io image pull docker.io/calico/node:v3.28.1
ctr -n k8s.io image pull docker.io/calico/kube-controllers:v3.28.1
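If some nodes have no internet access at all, the same images can be exported on a connected machine and imported on the offline ones; a small sketch with ctr (the tarball name is my own choice):
ctr -n k8s.io image export calico-v3.28.1.tar \
  docker.io/calico/cni:v3.28.1 \
  docker.io/calico/node:v3.28.1 \
  docker.io/calico/kube-controllers:v3.28.1
# on the offline node
ctr -n k8s.io image import calico-v3.28.1.tar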
That concludes the initialization steps
Deploy the master components
Initialize the cluster
Prepare the init configuration file; the full reference is in the official Configuration APIs docs
# cluster-level settings
## https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta3/
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # ip address and port of this apiserver instance
  advertiseAddress: 192.168.22.111
  bindPort: 6443
nodeRegistration:
  # container runtime to use
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  # node name shown later by kubectl get nodes
  ## defaults to the machine hostname if not set
  ## purely a matter of preference
  name: 192.168.22.111
  # node taints, set according to your needs
  taints: null
---
apiServer:
  # extra ip addresses involved in the HA setup
  ## they must be added to the certificate SAN list at init time
  certSANs:
  - 192.168.22.200
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
# directory for the k8s certificates
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
# apiserver access address; start with the current node's ip
controlPlaneEndpoint: 192.168.22.111:6443
controllerManager: {}
dns: {}
etcd:
  local:
    # etcd data directory; put it on an SSD if possible, etcd is sensitive to disk io
    dataDir: /var/lib/etcd
# image repository; the upstream default is registry.k8s.io, in China the Aliyun mirror works
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
networking:
  # dns domain used by the cluster
  dnsDomain: cluster.local
  # service network CIDR
  serviceSubnet: 10.96.0.0/12
  # pod network CIDR
  ## the A/B/C address classes are summarized at the end of the article
  podSubnet: 172.22.0.0/16
scheduler: {}
# kubelet settings
## https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# use the systemd cgroup driver
cgroupDriver: systemd
cgroupsPerQOS: true
# container log rotation
## rotate once a container log reaches this size, default 10Mi
containerLogMaxSize: 100Mi
## maximum number of rotated log files to keep, default 5
containerLogMaxFiles: 5
# kube-proxy settings
## https://kubernetes.io/docs/reference/config-api/kube-proxy-config.v1alpha1/
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# proxy mode: iptables or ipvs (kernelspace on Windows)
mode: iptables
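Before the real run, you can optionally do a dry run against the config (assuming the file above was saved as kubeadm.yaml); it renders the manifests and surfaces most configuration mistakes without touching the node:
kubeadm init --config kubeadm.yaml --dry-run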
Initialize the cluster from the config file
kubeadm init --config kubeadm.yaml
Output like the following means the cluster initialized successfully
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25
Install the Calico network plugin
The official manifest is on GitHub; clone the repo (or download the yaml) and edit a couple of things before applying:
- uncomment CALICO_IPV4POOL_CIDR and set it to the podSubnet from the kubeadm init file
- add IP_AUTODETECTION_METHOD to pin the interface; with multiple NICs you can otherwise run into odd issues
            - name: CALICO_IPV4POOL_CIDR
              value: "172.22.0.0/16"
            - name: IP_AUTODETECTION_METHOD
              value: "interface=ens3"
After applying, check that all the pods come up
kubectl get pod -n kube-system
Once the calico pods are Running you are good to go
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-97d84d657-bjlkx 1/1 Running 0 29s
calico-node-gppdv 1/1 Running 0 29s
Verify that the cluster is healthy
kubectl get nodes
Nodes showing Ready means everything is fine
NAME STATUS ROLES AGE VERSION
192.168.22.111 Ready control-plane 9m29s v1.28.2
Join the remaining master nodes
Run the following on each of the remaining master nodes
mkdir -p /etc/kubernetes/pki/etcd
Distribute the certificates from the first master
# copy to node 192.168.22.112
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} 192.168.22.112:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} 192.168.22.112:/etc/kubernetes/pki/etcd/
# copy to node 192.168.22.113
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} 192.168.22.113:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} 192.168.22.113:/etc/kubernetes/pki/etcd/
Join using the command printed at the end of kubeadm init; run each of the following on its own master node
# run on 192.168.22.112
kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
--control-plane \
--node-name 192.168.22.112
# run on 192.168.22.113
kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
--control-plane \
--node-name 192.168.22.113
Output like the following means the node joined successfully
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
Deploy nginx
This nginx load-balances the apiservers. First remove the control-plane taint, otherwise the master nodes cannot run regular pods
kubectl taint node --all node-role.kubernetes.io/control-plane-
Create the namespace
k create ns ha
Below is the full yaml, containing the ConfigMap and the Deployment; be sure to replace the ip addresses with your own. Here the proxy runs on the three master nodes.
---
apiVersion: v1
data:
  nginx.conf: |
    worker_processes 1;
    events {
        worker_connections 1024;
    }
    stream {
        upstream k8s-apiserver {
            hash $remote_addr consistent;
            server 192.168.22.111:6443 max_fails=3 fail_timeout=30s;
            server 192.168.22.112:6443 max_fails=3 fail_timeout=30s;
            server 192.168.22.113:6443 max_fails=3 fail_timeout=30s;
        }
        server {
            listen *:8443;
            proxy_connect_timeout 120s;
            proxy_pass k8s-apiserver;
        }
    }
kind: ConfigMap
metadata:
  name: nginx-lb-apiserver-cm
  namespace: ha
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-lb-apiserver
  name: nginx-lb-apiserver
  namespace: ha
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-lb-apiserver
  template:
    metadata:
      labels:
        app: nginx-lb-apiserver
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - 192.168.22.111
                - 192.168.22.112
                - 192.168.22.113
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx-lb-apiserver
            topologyKey: kubernetes.io/hostname
      containers:
      - image: docker.m.daocloud.io/nginx:1.26.0
        imagePullPolicy: IfNotPresent
        name: nginx-lb-apiserver
        resources:
          limits:
            cpu: 1000m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/nginx
          name: config
      hostNetwork: true
      volumes:
      - configMap:
          name: nginx-lb-apiserver-cm
        name: config
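Apply it and make sure the pods come up and the stream port answers locally (the file name here is my own choice):
kubectl apply -f nginx-lb-apiserver.yaml
kubectl get pod -n ha -o wide
nc -zv 127.0.0.1 8443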
Deploy keepalived
The HA components run as k8s pods. Building images directly with containerd is awkward, so build on any machine that has docker and import the image afterwards
Build the keepalived image
Dockerfile
FROM docker.m.daocloud.io/debian:stable-20240904-slim
ENV LANG="en_US.UTF-8"
ENV LANGUAGE="en_US:en"
ENV LC_ALL="en_US.UTF-8"
ENV KEEPALIVED_VERSION="2.3.1"
RUN sed -i.bak 's/deb.debian.org/mirrors.aliyun.com/g' /etc/apt/sources.list.d/debian.sources && \
    apt-get update && \
    apt-get install -y autoconf \
      make \
      curl \
      gcc \
      ipset \
      iptables \
      musl-dev \
      openssl \
      libssl-dev \
      net-tools \
      ncat \
    && curl -o keepalived.tar.gz -SL https://keepalived.org/software/keepalived-${KEEPALIVED_VERSION}.tar.gz \
    && tar xf keepalived.tar.gz \
    && cd keepalived-${KEEPALIVED_VERSION} \
    && ./configure --disable-dynamic-linking \
    && make \
    && make install \
    && rm -f /keepalived.tar.gz \
    && apt-get remove -y musl-dev \
      libssl-dev \
      make \
    && apt-get clean
Build the image
docker build -t keepalived-2.3.1:debian-20240904-slim .
Export the image
docker save keepalived-2.3.1:debian-20240904-slim > keepalived-2.3.1_debian-20240904-slim.tar
Distribute the image to the master nodes
scp keepalived-2.3.1_debian-20240904-slim.tar 192.168.22.111:/tmp/
scp keepalived-2.3.1_debian-20240904-slim.tar 192.168.22.112:/tmp/
scp keepalived-2.3.1_debian-20240904-slim.tar 192.168.22.113:/tmp/
Import the image into containerd's k8s.io namespace on each master node
ctr -n k8s.io image import /tmp/keepalived-2.3.1_debian-20240904-slim.tar
In the yaml below, change the ip addresses and the ens3 interface name to match your environment; the image name is the one built above, so adjust it too if yours differs
---
apiVersion: v1
data:
  keepalived.conf: |
    global_defs {
    }
    vrrp_script chk_nginx {
        script "/etc/keepalived/chk_health/chk_nginx.sh"
        interval 2
        fall 3
        rise 2
        timeout 3
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface ens3
        virtual_router_id 100
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass keep@lived
        }
        virtual_ipaddress {
            192.168.22.200
        }
        track_interface {
            ens3
        }
        nopreempt
        track_script {
            chk_nginx
        }
    }
kind: ConfigMap
metadata:
  name: keepalived-ha-apiserver-cm
  namespace: ha
---
apiVersion: v1
data:
  chk_nginx.sh: |
    #!/bin/bash
    exitNum=0
    while true
    do
        if ! nc -z 127.0.0.1 8443; then
            let exitNum++
            sleep 3
            [ ${exitNum} -lt 3 ] || exit 1
        else
            exit 0
        fi
    done
kind: ConfigMap
metadata:
  name: keepalived-ha-apiserver-chk-cm
  namespace: ha
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: keepalived-ha-apiserver
  name: keepalived-ha-apiserver
  namespace: ha
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keepalived-ha-apiserver
  template:
    metadata:
      labels:
        app: keepalived-ha-apiserver
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - 192.168.22.111
                - 192.168.22.112
                - 192.168.22.113
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - keepalived-ha-apiserver
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -c
        - /usr/local/sbin/keepalived -n -l -f /etc/keepalived/keepalived.conf
        command:
        - bash
        image: keepalived-2.3.1:debian-20240904-slim
        imagePullPolicy: IfNotPresent
        name: keepalived-ha-apiserver
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
        volumeMounts:
        - mountPath: /etc/keepalived
          name: config
        - mountPath: /etc/keepalived/chk_health
          name: chekscript
      hostNetwork: true
      volumes:
      - configMap:
          name: keepalived-ha-apiserver-cm
        name: config
      - configMap:
          defaultMode: 493
          name: keepalived-ha-apiserver-chk-cm
        name: chekscript
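Apply it and check that the pods are running on all three masters (again, the file name is my own choice):
kubectl apply -f keepalived-ha-apiserver.yaml
kubectl get pod -n ha -o wide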
Verify that the VIP works, using the VIP plus the nginx port
nc -zv 192.168.22.200 8443
Output like the following means the path is reachable
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.22.200:8443.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
Switch to the HA endpoint
Update controlPlaneEndpoint
On the node used for kubeadm init, edit the kubeadm-config configmap and change the controlPlaneEndpoint address
kubectl edit cm -n kube-system kubeadm-config
Change it to the VIP address
controlPlaneEndpoint: 192.168.22.200:8443
Update the kubeconfig files
The following has to be done on all three master nodes
Replace the apiserver address in each kubeconfig file; you can comment out the old line and add the new one
admin.conf
controller-manager.conf
kubelet.conf
scheduler.conf
server: https://192.168.22.200:8443
# server: https://192.168.22.111:6443
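Or do it in one pass on each master; the sketch below simply rewrites whatever server line is present to the VIP (the .bak suffix keeps a backup of each file), so double-check the result if you want to keep the old address around as a comment instead:
cd /etc/kubernetes
sed -i.bak -E 's#server: https://.*#server: https://192.168.22.200:8443#' \
  admin.conf controller-manager.conf kubelet.conf scheduler.conf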
Restart the master components
Restart controller-manager, scheduler and kubelet one node at a time and verify as you go
mv /etc/kubernetes/manifests/kube-scheduler.yaml .
# wait a moment, or run crictl ps to confirm the scheduler container is gone before moving the manifest back
mv kube-scheduler.yaml /etc/kubernetes/manifests/
mv /etc/kubernetes/manifests/kube-controller-manager.yaml .
# wait a moment, or run crictl ps to confirm the controller-manager container is gone before moving the manifest back
mv kube-controller-manager.yaml /etc/kubernetes/manifests/
systemctl restart kubelet
After the restarts, run the command below; if it returns node information normally, everything is fine
kubectl get node --kubeconfig /etc/kubernetes/admin.conf
Update the kube-proxy configuration
This only needs to be done once, from any one node
k edit cm -n kube-system kube-proxy
The part to change is the server field
    kubeconfig.conf: |-
      apiVersion: v1
      kind: Config
      clusters:
      - cluster:
          certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          server: https://192.168.22.200:8443
          # server: https://192.168.22.111:6443
        name: default
Restart kube-proxy
k get pod -n kube-system | awk '/kube-proxy/ {print $1}' | xargs k delete pod -n kube-system
Join the worker nodes
Generate a join command
kubeadm token create --print-join-command --ttl=0
It returns something like the following, which can then be run on the worker nodes
kubeadm join 192.168.22.200:8443 --token lkuzqm.0kzboiy72rryp3fb --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25
Here is how I joined my two workers
kubeadm join 192.168.22.200:8443 --token lkuzqm.0kzboiy72rryp3fb \
--discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
--node-name 192.168.22.114
kubeadm join 192.168.22.200:8443 --token lkuzqm.0kzboiy72rryp3fb \
--discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
--node-name 192.168.22.115
Renew certificates for ten years
Certificates are renewed here with a script written by someone on GitHub rather than by recompiling kubeadm, which is far more convenient
- update-kube-cert
- In short, the script reads the relevant fields out of the cluster's existing certificates with openssl and re-issues the component certificates from the existing CA (kubeadm's CA certificates are valid for 10 years; only the per-component certificates are limited to one year)
- If GitHub is hard to reach, just copy the script below
#!/usr/bin/env bash
set -o errexit
set -o pipefail
# set -o xtrace
# set output color
NC='\033[0m'
RED='\033[31m'
GREEN='\033[32m'
YELLOW='\033[33m'
BLUE='\033[34m'
# set default cri
CRI="docker"
log::err() {
printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][${RED}ERROR${NC}] %b\n" "$@"
}
log::info() {
printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][INFO] %b\n" "$@"
}
log::warning() {
printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][${YELLOW}WARNING${NC}] \033[0m%b\n" "$@"
}
check_file() {
if [[ ! -r ${1} ]]; then
log::err "can not find ${1}"
exit 1
fi
}
# get x509v3 subject alternative name from the old certificate
cert::get_subject_alt_name() {
local cert=${1}.crt
local alt_name
check_file "${cert}"
alt_name=$(openssl x509 -text -noout -in "${cert}" | grep -A1 'Alternative' | tail -n1 | sed 's/[[:space:]]*Address//g')
printf "%s\n" "${alt_name}"
}
# get subject from the old certificate
cert::get_subj() {
local cert=${1}.crt
local subj
check_file "${cert}"
subj=$(openssl x509 -text -noout -in "${cert}" | grep "Subject:" | sed 's/Subject:/\//g;s/\,/\//;s/[[:space:]]//g')
printf "%s\n" "${subj}"
}
cert::backup_file() {
local file=${1}
if [[ ! -e ${file}.old-$(date +%Y%m%d) ]]; then
cp -rp "${file}" "${file}.old-$(date +%Y%m%d)"
log::info "backup ${file} to ${file}.old-$(date +%Y%m%d)"
else
log::warning "does not backup, ${file}.old-$(date +%Y%m%d) already exists"
fi
}
# check certificate expiration
cert::check_cert_expiration() {
local cert=${1}.crt
local cert_expires
cert_expires=$(openssl x509 -text -noout -in "${cert}" | awk -F ": " '/Not After/{print$2}')
printf "%s\n" "${cert_expires}"
}
# check kubeconfig expiration
cert::check_kubeconfig_expiration() {
local config=${1}.conf
local cert
local cert_expires
cert=$(grep "client-certificate-data" "${config}" | awk '{print$2}' | base64 -d)
cert_expires=$(openssl x509 -text -noout -in <(printf "%s" "${cert}") | awk -F ": " '/Not After/{print$2}')
printf "%s\n" "${cert_expires}"
}
# check etcd certificates expiration
cert::check_etcd_certs_expiration() {
local cert
local certs
certs=(
"${ETCD_CERT_CA}"
"${ETCD_CERT_SERVER}"
"${ETCD_CERT_PEER}"
"${ETCD_CERT_HEALTHCHECK_CLIENT}"
"${ETCD_CERT_APISERVER_ETCD_CLIENT}"
)
for cert in "${certs[@]}"; do
if [[ ! -r ${cert} ]]; then
printf "%-50s%-30s\n" "${cert}.crt" "$(cert::check_cert_expiration "${cert}")"
fi
done
}
# check master certificates expiration
cert::check_master_certs_expiration() {
local certs
local kubeconfs
local cert
local conf
certs=(
"${CERT_CA}"
"${CERT_APISERVER}"
"${CERT_APISERVER_KUBELET_CLIENT}"
"${FRONT_PROXY_CA}"
"${FRONT_PROXY_CLIENT}"
)
# add support for super_admin.conf, which was added after k8s v1.30.
if [ -f "${CONF_SUPER_ADMIN}.conf" ]; then
kubeconfs=(
"${CONF_CONTROLLER_MANAGER}"
"${CONF_SCHEDULER}"
"${CONF_ADMIN}"
"${CONF_SUPER_ADMIN}"
)
else
kubeconfs=(
"${CONF_CONTROLLER_MANAGER}"
"${CONF_SCHEDULER}"
"${CONF_ADMIN}"
)
fi
printf "%-50s%-30s\n" "CERTIFICATE" "EXPIRES"
for conf in "${kubeconfs[@]}"; do
if [[ ! -r ${conf} ]]; then
printf "%-50s%-30s\n" "${conf}.config" "$(cert::check_kubeconfig_expiration "${conf}")"
fi
done
for cert in "${certs[@]}"; do
if [[ ! -r ${cert} ]]; then
printf "%-50s%-30s\n" "${cert}.crt" "$(cert::check_cert_expiration "${cert}")"
fi
done
}
# check all certificates expiration
cert::check_all_expiration() {
cert::check_master_certs_expiration
cert::check_etcd_certs_expiration
}
# generate certificate whit client, server or peer
# Args:
# $1 (the name of certificate)
# $2 (the type of certificate, must be one of client, server, peer)
# $3 (the subject of certificates)
# $4 (the validity of certificates) (days)
# $5 (the name of ca)
# $6 (the x509v3 subject alternative name of certificate when the type of certificate is server or peer)
cert::gen_cert() {
local cert_name=${1}
local cert_type=${2}
local subj=${3}
local cert_days=${4}
local ca_name=${5}
local alt_name=${6}
local ca_cert=${ca_name}.crt
local ca_key=${ca_name}.key
local cert=${cert_name}.crt
local key=${cert_name}.key
local csr=${cert_name}.csr
local common_csr_conf='distinguished_name = dn\n[dn]\n[v3_ext]\nkeyUsage = critical, digitalSignature, keyEncipherment\n'
for file in "${ca_cert}" "${ca_key}" "${cert}" "${key}"; do
check_file "${file}"
done
case "${cert_type}" in
client)
csr_conf=$(printf "%bextendedKeyUsage = clientAuth\n" "${common_csr_conf}")
;;
server)
csr_conf=$(printf "%bextendedKeyUsage = serverAuth\nsubjectAltName = %b\n" "${common_csr_conf}" "${alt_name}")
;;
peer)
csr_conf=$(printf "%bextendedKeyUsage = serverAuth, clientAuth\nsubjectAltName = %b\n" "${common_csr_conf}" "${alt_name}")
;;
*)
log::err "unknow, unsupported certs type: ${YELLOW}${cert_type}${NC}, supported type: client, server, peer"
exit 1
;;
esac
# gen csr
openssl req -new -key "${key}" -subj "${subj}" -reqexts v3_ext \
-config <(printf "%b" "${csr_conf}") \
-out "${csr}" >/dev/null 2>&1
# gen cert
openssl x509 -in "${csr}" -req -CA "${ca_cert}" -CAkey "${ca_key}" -CAcreateserial -extensions v3_ext \
-extfile <(printf "%b" "${csr_conf}") \
-days "${cert_days}" -out "${cert}" >/dev/null 2>&1
rm -f "${csr}"
}
cert::update_kubeconf() {
local cert_name=${1}
local kubeconf_file=${cert_name}.conf
local cert=${cert_name}.crt
local key=${cert_name}.key
local subj
local cert_base64
check_file "${kubeconf_file}"
# get the key from the old kubeconf
grep "client-key-data" "${kubeconf_file}" | awk '{print$2}' | base64 -d >"${key}"
# get the old certificate from the old kubeconf
grep "client-certificate-data" "${kubeconf_file}" | awk '{print$2}' | base64 -d >"${cert}"
# get subject from the old certificate
subj=$(cert::get_subj "${cert_name}")
cert::gen_cert "${cert_name}" "client" "${subj}" "${CERT_DAYS}" "${CERT_CA}"
# get certificate base64 code
cert_base64=$(base64 -w 0 "${cert}")
# set certificate base64 code to kubeconf
sed -i 's/client-certificate-data:.*/client-certificate-data: '"${cert_base64}"'/g' "${kubeconf_file}"
rm -f "${cert}"
rm -f "${key}"
}
cert::update_etcd_cert() {
local subj
local subject_alt_name
local cert
# generate etcd server,peer certificate
# /etc/kubernetes/pki/etcd/server
# /etc/kubernetes/pki/etcd/peer
for cert in ${ETCD_CERT_SERVER} ${ETCD_CERT_PEER}; do
subj=$(cert::get_subj "${cert}")
subject_alt_name=$(cert::get_subject_alt_name "${cert}")
cert::gen_cert "${cert}" "peer" "${subj}" "${CERT_DAYS}" "${ETCD_CERT_CA}" "${subject_alt_name}"
log::info "${GREEN}updated ${BLUE}${cert}.conf${NC}"
done
# generate etcd healthcheck-client,apiserver-etcd-client certificate
# /etc/kubernetes/pki/etcd/healthcheck-client
# /etc/kubernetes/pki/apiserver-etcd-client
for cert in ${ETCD_CERT_HEALTHCHECK_CLIENT} ${ETCD_CERT_APISERVER_ETCD_CLIENT}; do
subj=$(cert::get_subj "${cert}")
cert::gen_cert "${cert}" "client" "${subj}" "${CERT_DAYS}" "${ETCD_CERT_CA}"
log::info "${GREEN}updated ${BLUE}${cert}.conf${NC}"
done
# restart etcd
case $CRI in
"docker")
docker ps | awk '/k8s_etcd/{print$1}' | xargs -r -I '{}' docker restart {} >/dev/null 2>&1 || true
;;
"containerd")
crictl ps | awk '/etcd-/{print$(NF-1)}' | xargs -r -I '{}' crictl stopp {} >/dev/null 2>&1 || true
;;
esac
log::info "restarted etcd with ${CRI}"
}
cert::update_master_cert() {
local subj
local subject_alt_name
local conf
# generate apiserver server certificate
# /etc/kubernetes/pki/apiserver
subj=$(cert::get_subj "${CERT_APISERVER}")
subject_alt_name=$(cert::get_subject_alt_name "${CERT_APISERVER}")
cert::gen_cert "${CERT_APISERVER}" "server" "${subj}" "${CERT_DAYS}" "${CERT_CA}" "${subject_alt_name}"
log::info "${GREEN}updated ${BLUE}${CERT_APISERVER}.crt${NC}"
# generate apiserver-kubelet-client certificate
# /etc/kubernetes/pki/apiserver-kubelet-client
subj=$(cert::get_subj "${CERT_APISERVER_KUBELET_CLIENT}")
cert::gen_cert "${CERT_APISERVER_KUBELET_CLIENT}" "client" "${subj}" "${CERT_DAYS}" "${CERT_CA}"
log::info "${GREEN}updated ${BLUE}${CERT_APISERVER_KUBELET_CLIENT}.crt${NC}"
# generate kubeconf for controller-manager,scheduler and kubelet
# /etc/kubernetes/controller-manager,scheduler,admin,kubelet.conf,super_admin(added after k8s v1.30.)
if [ -f "${CONF_SUPER_ADMIN}.conf" ]; then
conf_list="${CONF_CONTROLLER_MANAGER} ${CONF_SCHEDULER} ${CONF_ADMIN} ${CONF_KUBELET} ${CONF_SUPER_ADMIN}"
else
conf_list="${CONF_CONTROLLER_MANAGER} ${CONF_SCHEDULER} ${CONF_ADMIN} ${CONF_KUBELET}"
fi
for conf in ${conf_list}; do
if [[ ${conf##*/} == "kubelet" ]]; then
# https://github.com/kubernetes/kubeadm/issues/1753
set +e
grep kubelet-client-current.pem /etc/kubernetes/kubelet.conf >/dev/null 2>&1
kubelet_cert_auto_update=$?
set -e
if [[ "$kubelet_cert_auto_update" == "0" ]]; then
log::info "does not need to update kubelet.conf"
continue
fi
fi
# update kubeconf
cert::update_kubeconf "${conf}"
log::info "${GREEN}updated ${BLUE}${conf}.conf${NC}"
# copy admin.conf to ${HOME}/.kube/config
if [[ ${conf##*/} == "admin" ]]; then
mkdir -p "${HOME}/.kube"
local config=${HOME}/.kube/config
local config_backup
config_backup=${HOME}/.kube/config.old-$(date +%Y%m%d)
if [[ -f ${config} ]] && [[ ! -f ${config_backup} ]]; then
cp -fp "${config}" "${config_backup}"
log::info "backup ${config} to ${config_backup}"
fi
cp -fp "${conf}.conf" "${HOME}/.kube/config"
log::info "copy the admin.conf to ${HOME}/.kube/config"
fi
done
# generate front-proxy-client certificate
# /etc/kubernetes/pki/front-proxy-client
subj=$(cert::get_subj "${FRONT_PROXY_CLIENT}")
cert::gen_cert "${FRONT_PROXY_CLIENT}" "client" "${subj}" "${CERT_DAYS}" "${FRONT_PROXY_CA}"
log::info "${GREEN}updated ${BLUE}${FRONT_PROXY_CLIENT}.crt${NC}"
# restart apiserver, controller-manager, scheduler and kubelet
for item in "apiserver" "controller-manager" "scheduler"; do
case $CRI in
"docker")
docker ps | awk '/k8s_kube-'${item}'/{print$1}' | xargs -r -I '{}' docker restart {} >/dev/null 2>&1 || true
;;
"containerd")
crictl ps | awk '/kube-'${item}'-/{print $(NF-1)}' | xargs -r -I '{}' crictl stopp {} >/dev/null 2>&1 || true
;;
esac
log::info "restarted ${item} with ${CRI}"
done
systemctl restart kubelet || true
log::info "restarted kubelet"
}
main() {
local node_type=$1
# read the options
ARGS=`getopt -o c: --long cri: -- "$@"`
eval set -- "$ARGS"
# extract options and their arguments into variables.
while true
do
case "$1" in
-c|--cri)
case "$2" in
"docker"|"containerd")
CRI=$2
shift 2
;;
*)
echo 'Unsupported cri. Valid options are "docker", "containerd".'
exit 1
;;
esac
;;
--)
shift
break
;;
*)
echo "Invalid arguments."
exit 1
;;
esac
done
CERT_DAYS=3650
KUBE_PATH=/etc/kubernetes
PKI_PATH=${KUBE_PATH}/pki
# master certificates path
# apiserver
CERT_CA=${PKI_PATH}/ca
CERT_APISERVER=${PKI_PATH}/apiserver
CERT_APISERVER_KUBELET_CLIENT=${PKI_PATH}/apiserver-kubelet-client
CONF_CONTROLLER_MANAGER=${KUBE_PATH}/controller-manager
CONF_SCHEDULER=${KUBE_PATH}/scheduler
CONF_ADMIN=${KUBE_PATH}/admin
CONF_SUPER_ADMIN=${KUBE_PATH}/super-admin
CONF_KUBELET=${KUBE_PATH}/kubelet
# front-proxy
FRONT_PROXY_CA=${PKI_PATH}/front-proxy-ca
FRONT_PROXY_CLIENT=${PKI_PATH}/front-proxy-client
# etcd certificates path
ETCD_CERT_CA=${PKI_PATH}/etcd/ca
ETCD_CERT_SERVER=${PKI_PATH}/etcd/server
ETCD_CERT_PEER=${PKI_PATH}/etcd/peer
ETCD_CERT_HEALTHCHECK_CLIENT=${PKI_PATH}/etcd/healthcheck-client
ETCD_CERT_APISERVER_ETCD_CLIENT=${PKI_PATH}/apiserver-etcd-client
case ${node_type} in
# etcd)
# # update etcd certificates
# cert::update_etcd_cert
# ;;
master)
# check certificates expiration
cert::check_master_certs_expiration
# backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
cert::backup_file "${KUBE_PATH}"
# update master certificates and kubeconf
log::info "${GREEN}updating...${NC}"
cert::update_master_cert
log::info "${GREEN}done!!!${NC}"
# check certificates expiration after certificates updated
cert::check_master_certs_expiration
;;
all)
# check certificates expiration
cert::check_all_expiration
# backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
cert::backup_file "${KUBE_PATH}"
# update etcd certificates
log::info "${GREEN}updating...${NC}"
cert::update_etcd_cert
# update master certificates and kubeconf
cert::update_master_cert
log::info "${GREEN}done!!!${NC}"
# check certificates expiration after certificates updated
cert::check_all_expiration
;;
check)
# check certificates expiration
cert::check_all_expiration
;;
*)
log::err "unknown, unsupported cert type: ${node_type}, supported type: \"all\", \"master\""
printf "Documentation: https://github.com/yuyicai/update-kube-cert
example:
'\033[32m./update-kubeadm-cert.sh all\033[0m' update all etcd certificates, master certificates and kubeconf
/etc/kubernetes
├── admin.conf
├── super-admin.conf
├── controller-manager.conf
├── scheduler.conf
├── kubelet.conf
└── pki
├── apiserver.crt
├── apiserver-etcd-client.crt
├── apiserver-kubelet-client.crt
├── front-proxy-client.crt
└── etcd
├── healthcheck-client.crt
├── peer.crt
└── server.crt
'\033[32m./update-kubeadm-cert.sh master\033[0m' update only master certificates and kubeconf
/etc/kubernetes
├── admin.conf
├── super-admin.conf
├── controller-manager.conf
├── scheduler.conf
├── kubelet.conf
└── pki
├── apiserver.crt
├── apiserver-kubelet-client.crt
└── front-proxy-client.crt
"
exit 1
;;
esac
}
main "$@"
The script must be run on every master node
bash update-kubeadm-cert.sh all --cri containerd
Check the certificate expiration dates
kubeadm certs check-expiration
As you can see, everything now expires ten years out
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Sep 13, 2034 13:01 UTC 9y ca no
apiserver Sep 13, 2034 13:01 UTC 9y ca no
apiserver-etcd-client Sep 13, 2034 13:01 UTC 9y etcd-ca no
apiserver-kubelet-client Sep 13, 2034 13:01 UTC 9y ca no
controller-manager.conf Sep 13, 2034 13:01 UTC 9y ca no
etcd-healthcheck-client Sep 13, 2034 13:01 UTC 9y etcd-ca no
etcd-peer Sep 13, 2034 13:01 UTC 9y etcd-ca no
etcd-server Sep 13, 2034 13:01 UTC 9y etcd-ca no
front-proxy-client Sep 13, 2034 13:01 UTC 9y front-proxy-ca no
scheduler.conf Sep 13, 2034 13:01 UTC 9y ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Sep 13, 2034 11:08 UTC 9y no
etcd-ca Sep 13, 2034 11:08 UTC 9y no
front-proxy-ca Sep 13, 2034 11:08 UTC 9y no
At this point the whole highly available k8s cluster is deployed
Simulate a node failure
Use the ip a command to see which node currently holds the VIP
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:08:f7:18 brd ff:ff:ff:ff:ff:ff
inet 192.168.22.113/24 brd 192.168.22.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet 192.168.22.200/32 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe08:f718/64 scope link
valid_lft forever preferred_lft forever
I simply shut that node down to simulate a failure, then check another node to see whether the VIP has moved over
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:88:e7:0d brd ff:ff:ff:ff:ff:ff
inet 192.168.22.112/24 brd 192.168.22.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet 192.168.22.200/32 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe88:e70d/64 scope link
valid_lft forever preferred_lft forever
Checking node status still works fine through the VIP; node 113 is now NotReady because I powered it off
NAME STATUS ROLES AGE VERSION
192.168.22.111 Ready control-plane 115m v1.28.2
192.168.22.112 Ready control-plane 110m v1.28.2
192.168.22.113 NotReady control-plane 108m v1.28.2
192.168.22.114 Ready <none> 3m49s v1.28.2
192.168.22.115 Ready <none> 3m38s v1.28.2
Class A/B/C address summary
Class | Address range | Default subnet mask | Network/host bits | Usable IPs per network | Private range |
---|---|---|---|---|---|
Class A | 1.0.0.0 to 126.255.255.255 | 255.0.0.0 | 8 / 24 | 16,777,214 | 10.0.0.0 to 10.255.255.255 |
Class B | 128.0.0.0 to 191.255.255.255 | 255.255.0.0 | 16 / 16 | 65,534 | 172.16.0.0 to 172.31.255.255 |
Class C | 192.0.0.0 to 223.255.255.255 | 255.255.255.0 | 24 / 8 | 254 | 192.168.0.0 to 192.168.255.255 |