Installing a Kubernetes Cluster on Virtual Machines
Environment Preparation
- Master node:
  - IP address: 192.168.40.100
  - Hostname: k8s-master
- Worker node:
  - IP address: 192.168.40.101
  - Hostname: k8s-node1
Step 1: Configure the Virtual Machine Environment
1.1 Set the hostname
Set a unique hostname on each virtual machine:
# Run on the master node
sudo hostnamectl set-hostname k8s-master
# Run on the worker node
sudo hostnamectl set-hostname k8s-node1
1.2 Configure the hosts file
Edit /etc/hosts and add the IP-to-hostname mapping for every node:
sudo vi /etc/hosts
Add the following entries:
192.168.40.100 k8s-master
192.168.40.101 k8s-node1
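As a quick sanity check (optional), you can ping each node by its new hostname from the other node to confirm the mapping works:
ping -c 2 k8s-master
ping -c 2 k8s-node1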
1.3 Update the system and install dependencies
sudo yum update -y
sudo yum install -y curl vim wget git
1.4 Disable the firewall and SELinux
sudo systemctl stop firewalld
sudo systemctl disable firewalld
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
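To confirm both changes took effect (expect "inactive" and "Permissive" respectively):
sudo systemctl is-active firewalld
getenforce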
1.5 Disable swap (required by Kubernetes)
sudo swapoff -a
sudo sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab
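To confirm swap is fully off (the Swap row of free should be all zeros, and /proc/swaps should list no devices):
free -h
cat /proc/swaps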
1.6 Configure IP forwarding and bridged traffic
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
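You can optionally verify that the module is loaded and the new kernel parameters are active:
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward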
Step 2: Install containerd
2.1 Install containerd
Download and install containerd:
# Download containerd
wget https://github.com/containerd/containerd/releases/download/v1.7.0/containerd-1.7.0-linux-amd64.tar.gz
# Extract into /usr/local
sudo tar -C /usr/local -xzvf containerd-1.7.0-linux-amd64.tar.gz
# Download the systemd service file
sudo mkdir -p /usr/local/lib/systemd/system
sudo curl -o /usr/local/lib/systemd/system/containerd.service https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
# Reload the systemd configuration
sudo systemctl daemon-reload
# Enable and start containerd
sudo systemctl enable containerd
sudo systemctl start containerd
# Verify the installation
containerd --version
2.2 Configure containerd
sudo mkdir -p /etc/containerd
# Generate the default configuration file
containerd config default | sudo tee /etc/containerd/config.toml
# Use the systemd cgroup driver, matching the kubelet default that kubeadm sets up
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
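Before moving on, it is worth confirming that the sed actually flipped the flag, since the config layout can differ between containerd versions:
grep 'SystemdCgroup' /etc/containerd/config.toml
# Expected: SystemdCgroup = true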
2.3 Install runc
runc is one of the core components of container technology: it creates and runs containers according to the OCI (Open Container Initiative) specification. Docker, containerd, and similar tools all depend on runc.
Why install a replacement runc? The containerd release tarball does not bundle runc, and the copy of runc already on the system did not work in my environment (a common problem on CentOS 7, where the system libseccomp 2.3.x is too old). So we build a newer libseccomp and install a current runc release binary in its place.
# 1. Install libseccomp
# Download the libseccomp source tarball
wget https://github.com/seccomp/libseccomp/releases/download/v2.5.4/libseccomp-2.5.4.tar.gz
# Extract it
tar xf libseccomp-2.5.4.tar.gz
# Enter the extracted directory
cd libseccomp-2.5.4
# Install the build dependency
sudo yum install gperf -y
# Configure, compile, and install
./configure
make
sudo make install
# Locate the installed libseccomp.so
sudo find / -name "libseccomp.so"
# 2. Install runc
# Download the binary and make it executable
wget https://github.com/opencontainers/runc/releases/download/v1.1.5/runc.amd64
chmod +x runc.amd64
# Locate the existing runc
which runc
# Back up the existing runc
sudo mv /usr/local/sbin/runc /usr/local/sbin/runc.bak
# Move the new runc into place
sudo mv runc.amd64 /usr/local/sbin/runc
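Verify that the replacement is the binary now on the PATH; recent runc releases also report the libseccomp version they were built with:
runc --version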
2.4 Restart and enable containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
Step 3: Install the Kubernetes Components
3.1 Add the Kubernetes YUM repository
# Create the Kubernetes YUM repository file
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
3.2 Install kubeadm, kubelet, and kubectl
sudo yum install -y kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2 --disableexcludes=kubernetes
3.3 Enable and start kubelet
sudo systemctl enable --now kubelet
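At this point kubelet will restart in a crash loop until the cluster is initialized; that is expected. You can still confirm that all three components landed at the intended version:
kubeadm version
kubelet --version
kubectl version --client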
Step 4: Initialize the Cluster
4.1 Initialize the cluster on the master node
Run the following commands on the master node (k8s-master) to initialize the Kubernetes cluster.
192.168.0.0/16 is the Calico network plugin's default IP pool, so if you intend to use Calico, pass 192.168.0.0/16 here.
Note: in principle this pod CIDR can be almost any range (it should not overlap the VM network), but it must stay consistent with the network plugin's configuration. Strictly speaking, 192.168.0.0/16 even contains our host subnet 192.168.40.0/24; it works for this lab, but a fully disjoint range would be cleaner. Experienced users can instead edit Calico's pool and keep the two sides in sync; I am sticking with the default here.
If you use a different network plugin, the expected value may differ; check that plugin's documentation.
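To illustrate keeping the two sides consistent (a hypothetical sketch, not what I ran): with a disjoint pod CIDR such as 10.244.0.0/16, you would pass it to kubeadm and set Calico's matching pool in calico.yaml before applying the manifest.
# Hypothetical alternative: a pod CIDR disjoint from the host network
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# ...then, in calico.yaml, uncomment and set the pool to match:
#   - name: CALICO_IPV4POOL_CIDR
#     value: "10.244.0.0/16"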
# If a previous initialization attempt failed, clean it up first
sudo kubeadm reset
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
If it succeeds, you will see output similar to the following. (With the machines prepared as above and matching component versions, this step normally succeeds.)
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.40.100:6443 --token d2j2ew.qmgrhops9sw5jubp \
--discovery-token-ca-cert-hash sha256:9a2b7e5427e94c5453bc498c7ef35362fda97e6719f1fefcef72f52db2297b07
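If you lose this command, or the token expires (tokens are valid for 24 hours by default), you can generate a fresh join command on the master at any time:
kubeadm token create --print-join-command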
4.2 Configure kubectl
Configure kubectl as instructed in the output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
4.3 Verify
Check the nodes:
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS   ROLES           AGE    VERSION
k8s-master   Ready    control-plane   102s   v1.28.2
Check the pods:
[root@k8s-master ~]# kubectl get pod -A
NAMESPACE     NAME                                  READY   STATUS              RESTARTS   AGE
kube-system   coredns-5dd5756b68-cx54x              0/1     ContainerCreating   0          16s
kube-system   coredns-5dd5756b68-tp4qc              0/1     ContainerCreating   0          16s
kube-system   etcd-k8s-master                       1/1     Running             6          28s
kube-system   kube-apiserver-k8s-master             1/1     Running             6          31s
kube-system   kube-controller-manager-k8s-master    1/1     Running             6          28s
kube-system   kube-proxy-9cwrj                      1/1     Running             0          16s
kube-system   kube-scheduler-k8s-master             1/1     Running             6          28s
Step 5: Deploy the Pod Network
Choose and deploy a pod network plugin; Calico is used here as the example.
Note: reaching this YAML URL from mainland China may require a VPN; I did not find a suitable mirror for this version inside China.
kubectl apply -f https://docs.projectcalico.org/v3.25/manifests/calico.yaml
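Pulling the Calico images can take a few minutes. One optional way to block until the node agent is ready (the k8s-app=calico-node label comes from the manifest above):
kubectl -n kube-system wait --for=condition=Ready pod -l k8s-app=calico-node --timeout=300s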
Checking the pods again: calico-kube-controllers and calico-node have appeared, and the coredns pods are now running as well:
[root@k8s-master ~]# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-658d97c59c-c84l7   1/1     Running   0          25s
kube-system   calico-node-cnzm9                          1/1     Running   0          25s
kube-system   coredns-5dd5756b68-cx54x                   1/1     Running   0          3m9s
kube-system   coredns-5dd5756b68-tp4qc                   1/1     Running   0          3m9s
kube-system   etcd-k8s-master                            1/1     Running   6          3m21s
kube-system   kube-apiserver-k8s-master                  1/1     Running   6          3m24s
kube-system   kube-controller-manager-k8s-master         1/1     Running   6          3m21s
kube-system   kube-proxy-9cwrj                           1/1     Running   0          3m9s
kube-system   kube-scheduler-k8s-master                  1/1     Running   6          3m21s
Step 6: Join the Worker Node
6.1 Join procedure
On the worker node (k8s-node1), run the kubeadm join command that the master printed during initialization (steps 1 through 3 must already be done on the worker). For example:
kubeadm join 192.168.40.100:6443 --token d2j2ew.qmgrhops9sw5jubp \
--discovery-token-ca-cert-hash sha256:9a2b7e5427e94c5453bc498c7ef35362fda97e6719f1fefcef72f52db2297b07
The following output indicates that the worker node joined successfully:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Check the nodes and pods again:
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS   ROLES           AGE    VERSION
k8s-master   Ready    control-plane   7m3s   v1.28.2
k8s-node1    Ready    <none>          57s    v1.28.2
[root@k8s-master ~]# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-658d97c59c-c84l7   1/1     Running   0          4m40s
kube-system   calico-node-cnzm9                          1/1     Running   0          4m40s
kube-system   calico-node-hx56t                          1/1     Running   0          94s
kube-system   coredns-5dd5756b68-cx54x                   1/1     Running   0          7m24s
kube-system   coredns-5dd5756b68-tp4qc                   1/1     Running   0          7m24s
kube-system   etcd-k8s-master                            1/1     Running   6          7m36s
kube-system   kube-apiserver-k8s-master                  1/1     Running   6          7m39s
kube-system   kube-controller-manager-k8s-master         1/1     Running   6          7m36s
kube-system   kube-proxy-9cwrj                           1/1     Running   0          7m24s
kube-system   kube-proxy-rkf7v                           1/1     Running   0          94s
kube-system   kube-scheduler-k8s-master                  1/1     Running   6          7m36s
6.2 Troubleshooting
In my case the worker node at first could not join the cluster; the join command hung at the point shown below:
[root@k8s-node1 /]# kubeadm join 192.168.40.100:6443 --token ovg3gc.aaoioxvzi495tw9k \
> --discovery-token-ca-cert-hash sha256:b803cb2421fc6ea3f02669c0f86a031607aee6ac7bc8a466edf247aa3960a314
[preflight] Running pre-flight checks
Checking the logs showed that ports 6443 and 179 on the master node could not be reached from the worker node.
Both ports were reachable from the master node itself,
and the master's IP and hostname could be pinged from the worker, so the problem looked like packet filtering on the master.
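The reachability check itself can be reproduced from the worker node like this (nc is provided by the nmap-ncat package on CentOS 7):
# Any response, even an error, means the port is reachable; a hang/timeout means it is not
curl -k https://192.168.40.100:6443/version
nc -zv 192.168.40.100 6443
nc -zv 192.168.40.100 179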
Running the following commands on the master node resolved it:
sudo iptables -I INPUT 1 -p tcp --dport 179 -j ACCEPT
sudo iptables -I INPUT 1 -p tcp --dport 6443 -j ACCEPT
To keep these rules in effect after a server reboot, the iptables rules need to be persisted:
# Install iptables-services
sudo yum install iptables-services -y
# Add the rules
sudo iptables -I INPUT 1 -p tcp --dport 179 -j ACCEPT
sudo iptables -I INPUT 1 -p tcp --dport 6443 -j ACCEPT
# Save the rules
sudo service iptables save
# Or:
sudo iptables-save | sudo tee /etc/sysconfig/iptables
# Start and enable the iptables service
sudo systemctl start iptables
sudo systemctl enable iptables
# Check the status
sudo systemctl status iptables
sudo iptables -L -v
If, after a reboot, kubectl reports the following error:
Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
run the following commands to refresh the kubeconfig:
# Remove the old .kube directory
rm -rf ~/.kube/
# Create a fresh .kube directory and copy in admin.conf
mkdir ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/
# Enter .kube and rename admin.conf to config
cd ~/.kube
mv admin.conf config
# Tighten the permissions on ~/.kube/config
chmod 600 ~/.kube/config
# Restart the kubelet service
sudo systemctl restart kubelet
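After the restart, kubectl should be able to talk to the API server again:
kubectl get nodes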