Complete Steps to Install a Kubernetes Cluster with kubeadm
Prerequisites
- Prepare 3 servers, each with at least 2 GB of RAM; the master node needs at least 2 CPU cores.
- Synchronize the servers' clocks (NTP).
- Set up passwordless SSH between the servers.
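The two preparation steps above can be sketched as follows (a minimal sketch assuming CentOS 7 with chrony as the NTP client, and the node hostnames used later in this guide):

```shell
# Time sync via NTP (chrony is the default NTP client on CentOS 7)
yum install -y chrony
systemctl enable --now chronyd
chronyc sources          # verify that time sources are reachable

# Passwordless SSH from the master to each node
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa    # skip if a key already exists
ssh-copy-id root@node109
ssh-copy-id root@node112
```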
Installation Steps
Configure /etc/hosts
vi /etc/hosts
192.168.0.110 master110
192.168.0.109 node109
192.168.0.112 node112
Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
Disable swap
(Kubernetes recommends disabling swap for system stability and performance.)
Temporarily: swapoff -a
Permanently (takes effect after a reboot):
vi /etc/fstab
[root@master110 ~]# less /etc/fstab
#
# /etc/fstab
# Created by anaconda on Wed Mar 27 06:37:15 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=ae1ef289-9c30-4338-af1f-6be0af376392 /boot xfs defaults 0 0
#/dev/mapper/centos-swap swap swap defaults 0 0
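As an alternative to commenting out the swap line by hand, sed can do it and the result can be verified; a sketch, assuming a standard fstab swap entry like the one shown above:

```shell
# Comment out the swap entry in /etc/fstab, then turn swap off for the running system
sed -ri 's@^([^#].*\sswap\s+swap\s.*)@#\1@' /etc/fstab
swapoff -a
free -m          # the Swap row should show 0 total
swapon --show    # prints nothing when no swap is active
```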
Disable SELinux
vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
#SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
#SELINUXTYPE=targeted
SELINUX=disabled
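The config-file change only takes effect after a reboot; to also switch SELinux off for the currently running system (a common companion step, shown as a sketch):

```shell
setenforce 0     # put the running system into permissive mode immediately
getenforce       # prints "Permissive" (or "Disabled" after a reboot with SELINUX=disabled)
```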
Configure iptables bridging parameters
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
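To confirm the module and sysctl settings took effect without a reboot, a few verification commands (a sketch):

```shell
modprobe br_netfilter                  # load the module now (k8s.conf only covers boot time)
lsmod | grep br_netfilter              # the module should be listed
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward   # both should print 1
```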
Install the container runtime (containerd)
yum install -y yum-utils
# Add the Docker repository (Aliyun mirror)
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install
yum install -y containerd
# Start containerd
sudo systemctl start containerd
# Enable start on boot
sudo systemctl enable containerd
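A quick check that containerd is installed and running (verification sketch):

```shell
containerd --version             # print the installed version
systemctl is-active containerd   # prints "active" when the daemon is running
```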
Install kubeadm, kubelet, and kubectl
- kubeadm: the command used to bootstrap the cluster.
- kubelet: runs on every node in the cluster and starts Pods and containers.
- kubectl: the command-line tool used to talk to the cluster.
Add the Kubernetes yum repository
vi /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
Install kubeadm, kubelet, and kubectl
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable --now kubelet
- --disableexcludes=kubernetes: disables any exclude rules related to kubernetes in the yum configuration or specific repos, ensuring the packages can be installed from the Kubernetes repository without being blocked by system-configured exclude entries.
- enable: start the service automatically on boot
- --now: also start the kubelet service immediately
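To confirm the three tools are installed and report consistent versions (verification sketch):

```shell
kubeadm version -o short      # e.g. a v1.28.x version string
kubelet --version
kubectl version --client
```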
Initialize the control plane with kubeadm init
kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.28.0 --apiserver-advertise-address=192.168.0.110 --pod-network-cidr=10.244.0.0/16
- --image-repository: the registry to pull Kubernetes images from
- --kubernetes-version: the Kubernetes version to install
- --apiserver-advertise-address: the address the API server advertises to the cluster
- --pod-network-cidr: the CIDR range for the Pod network
Set up the kubectl config file
After kubeadm init finishes, set up the kubectl environment by following the printed instructions: copy /etc/kubernetes/admin.conf to $HOME/.kube/config.
The admin.conf file contains the cluster's authentication and configuration data; kubectl uses it to connect to the Kubernetes API server.
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.110:6443 --token e7k2lc.xle4heu4e8u9wf51 \
--discovery-token-ca-cert-hash sha256:76d7d0e07423a26fa42168a56f739243a7fcb1af70255e1d01d76fa60b1e9f01
Join worker nodes to the cluster
After kubeadm init completes on the master node, run the kubeadm join command above on each worker node to join it to the cluster.
If kubeadm join hangs, the token may have expired; add --v=5 to see verbose logs.
Regenerate a token on the master node with:
[root@master110 modules-load.d]# kubeadm token create --print-join-command
kubeadm join 192.168.0.110:6443 --token q5r3qz.yphz3qvltweoiqtx --discovery-token-ca-cert-hash sha256:76d7d0e07423a26fa42168a56f739243a7fcb1af70255e1d01d76fa60b1e9f01
Then re-run the printed command on the worker nodes.
Install the Flannel plugin
After the steps above, kubectl get node will still show the nodes in NotReady state.
For containers in the cluster to communicate with each other, a network plugin must be installed. Flannel is one of the most commonly used Kubernetes network plugins, providing network connectivity between containers.
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Modify the net-conf.json section:
net-conf.json: |
{
"Network": "10.244.0.0/16",
"EnableNFTables": false,
"Backend": {
"Type": "vxlan"
}
}
Run the deployment command:
kubectl apply -f kube-flannel.yml
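After applying the manifest, the Flannel DaemonSet can be checked until it is running; a sketch, assuming a recent kube-flannel.yml that creates the kube-flannel namespace (older manifests deployed into kube-system):

```shell
kubectl -n kube-flannel get pods -o wide   # one kube-flannel-ds pod per node, status Running
kubectl get nodes                          # nodes should move to Ready once Flannel is up
```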
Q&A
Issue 1: kubeadm init fails with an error
# failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8…
Solution:
Run containerd config default > /etc/containerd/config.toml
(Note: do not just edit config.toml in place; the config.toml already on disk may be incomplete.)
Then, in that file, find sandbox_image = "registry.k8s.io/pause:3.6"
and change it to sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
Finally, restart containerd: systemctl restart containerd
[root@master128 ~]# containerd config default > /etc/containerd/config.toml
[root@master128 ~]# vi /etc/containerd/config.toml
[root@master128 ~]# systemctl restart containerd
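To confirm the change was picked up after the restart (verification sketch):

```shell
grep sandbox_image /etc/containerd/config.toml   # should show the aliyuncs pause image
systemctl is-active containerd                   # prints "active" after a successful restart
```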
Issue 2: After kubectl apply -f kube-flannel.yml, kubectl get pod -A shows the kube-flannel-ds Pods stuck in ImagePullBackOff.
Solution:
kubectl describe on the Pod shows the image pull failed; a containerd registry mirror (accelerator) must be configured. The configuration method is described in Aliyun's guide:
https://help.aliyun.com/zh/acr/user-guide/accelerate-the-pulls-of-docker-official-images
containerd reads its default configuration from /etc/containerd/config.toml. Registry-related configuration placed under config_path is hot-reloaded, so no containerd restart is needed for those changes. If you use a different path, adjust accordingly.

1. Make sure the config file contains config_path. Run the following to check whether the default config already sets config_path (for example "/etc/containerd/certs.d"):
cat /etc/containerd/config.toml | grep config_path -C 5
If it does not, add the configuration below. If [plugins."io.containerd.grpc.v1.cri".registry] already exists, only add config_path beneath it (mind the indentation); otherwise the whole block can be written anywhere in the file:
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"

2. Check for mirror-related configuration. If /etc/containerd/config.toml already contains mirrors entries such as the following, clean them up to avoid conflicts:
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://registry-1.docker.io"]

3. Restart containerd (optional). If the default config file itself changed, restart it:
systemctl restart containerd
If the restart fails, check the cause (usually a leftover conflict in the config file) and adjust accordingly:
journalctl -u containerd

4. Create the mirror configuration file. Under the config_path directory, create docker.io/hosts.toml, i.e. /etc/containerd/certs.d/docker.io/hosts.toml, with:
server = "https://registry-1.docker.io"
[host."<mirror address, e.g. https://docker.unsee.tech>"]
  capabilities = ["pull", "resolve", "push"]

The mirror used for this install:
server = "https://registry-1.docker.io"
[host."https://docker.unsee.tech"]
  capabilities = ["pull", "resolve", "push"]
Verify cluster status
# All nodes are in the Ready state
[root@master docker.io]# kubectl get node
NAME STATUS ROLES AGE VERSION
data Ready <none> 40h v1.28.2
master Ready control-plane 40h v1.28.2
# Quickly start an nginx pod
kubectl run my-nginx --image=nginx --port=80
# Check the pod's status
[root@master docker.io]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-nginx 1/1 Running 0 20m 10.244.1.6 data <none> <none>
# Access nginx successfully
[root@master docker.io]# curl http://10.244.1.6
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>