Microservices on K8S, Part 1: Service Discovery and Load Balancing Test (with a Fix for a Calico Network Issue)
1. Requirements
Deploy two services, server1 and server2, on K8S. server2 calls server1's /test1 endpoint, and the calls must be load-balanced.
2. Implement the two services used for testing
pyserver1.py
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/test1', methods=['GET'])
def test1():
    return jsonify({"message": "This is Server1!"})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5001)
pyserver2.py
import requests
from flask import Flask, jsonify

app = Flask(__name__)

SERVER1_URL = "http://server1-service.default.svc.cluster.local/test1"

@app.route('/test2', methods=['GET'])
def test2():
    try:
        response = requests.get(SERVER1_URL)
        response.raise_for_status()
        return jsonify(response.json()), response.status_code
    except requests.RequestException as e:
        return jsonify({"error": str(e)}), 500

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5002)
Note that the API address http://server1-service.default.svc.cluster.local:80/test1 matches the Service section of server1-deployment-service.yaml further below: the hostname is <Service name>.<namespace>.svc.cluster.local, and port 80 is the Service's port, which forwards to targetPort 5001 on the pods.
3. Build the Docker images
- export DOCKER_BUILDKIT=0 # works around very slow image builds on some Docker versions, e.g. docker-ce 18.05
- The base image is a Python image pulled from an Alibaba Cloud registry. Find it on the Alibaba Cloud Container Registry web console first, then docker pull it. A docker login is required before pulling, using the same account/password as the web console, and the account needs the appropriate permissions. Once logged in, there is no need to set insecure-registries or registry-mirrors in /etc/docker/daemon.json.
- To build the server2 image, simply change every "1" to "2" in Dockerfile-server1 below.
FROM alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/python:3.11.1
WORKDIR /app
RUN pip install --no-cache-dir requests flask -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com
COPY ./pyserver1.py .
CMD ["python", "pyserver1.py"]
Run the builds:
docker build -f Dockerfile-server1 -t server1 .
docker build -f Dockerfile-server2 -t server2 .
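Since the Deployments below use imagePullPolicy: Never, the images are never pulled from any registry; they have to already sit in the local image store of every node the pods can land on. The following is a minimal sketch of a local smoke test plus copying the images to the other node, assuming SSH access between the nodes (192.168.72.127 is simply my second node):
# quick smoke test of the freshly built server1 image
docker run -d --rm --name server1-smoke -p 5001:5001 server1
sleep 2                                  # give flask a moment to start
curl http://localhost:5001/test1         # expect {"message": "This is Server1!"}
docker stop server1-smoke

# ship both images to the other node, because imagePullPolicy: Never skips registries
docker save -o servers.tar server1 server2
scp servers.tar root@192.168.72.127:/root/
ssh root@192.168.72.127 "docker load -i /root/servers.tar"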
4. Write the Deployment/Service YAML files
server1-deployment-service.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: label-server1
  template:
    metadata:
      labels:
        app: label-server1
    spec:
      containers:
      - name: server1-instance
        image: server1:latest
        imagePullPolicy: Never
        ports:
        - containerPort: 5001
---
apiVersion: v1
kind: Service
metadata:
  name: server1-service
  namespace: default
spec:
  selector:
    app: label-server1
  ports:
  - protocol: TCP
    port: 80
    targetPort: 5001
  type: LoadBalancer
server2-deployment-service.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: label-server2
  template:
    metadata:
      labels:
        app: label-server2
    spec:
      containers:
      - name: server2-instance
        image: server2:latest
        imagePullPolicy: Never
        ports:
        - containerPort: 5002
---
apiVersion: v1
kind: Service
metadata:
  name: server2-service
  namespace: default
spec:
  selector:
    app: label-server2
  ports:
  - protocol: TCP
    port: 80
    targetPort: 5002
  type: LoadBalancer
Remove the NoSchedule taint from the master so that server1 and server2 can each schedule one replica onto it: kubectl taint nodes 128master node-role.kubernetes.io/master:NoSchedule-
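Depending on the Kubernetes version, the control-plane taint key may be node-role.kubernetes.io/control-plane instead of (or in addition to) node-role.kubernetes.io/master; if the command above reports that the taint is not found, this variant can be tried (same node name as above):
kubectl taint nodes 128master node-role.kubernetes.io/control-plane:NoSchedule-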
Deploy the applications and expose the Services:
kubectl apply -f server1-deployment-service.yaml
kubectl apply -f server2-deployment-service.yaml
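Before hitting the endpoints, it is worth confirming that the replicas are spread across the nodes and checking which NodePorts were allocated. With type: LoadBalancer but no cloud load-balancer controller, EXTERNAL-IP stays <pending> and the Services are reached through their NodePorts. A minimal sketch:
# two replicas per Deployment, ideally one on each node
kubectl get pods -o wide -l 'app in (label-server1, label-server2)'

# EXTERNAL-IP stays <pending> without a cloud provider; the PORT(S) column
# shows the allocated NodePort, e.g. 80:30309/TCP
kubectl get svc server1-service server2-service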
5. API call test
Call server2's /test2 endpoint, which in turn calls server1's /test1 endpoint.
Watch the load-balancing behavior: if you don't need sophisticated load-balancing policies, building microservices directly on K8S works perfectly well.
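To actually see requests spreading over the two server1 replicas, the NodePort of server2-service (30309 in my environment) can be hammered in a loop and the replica logs inspected afterwards. A sketch (kubectl logs --prefix needs a reasonably recent kubectl):
# fire a batch of requests at server2, which forwards each one to server1
for i in $(seq 1 10); do curl -s http://localhost:30309/test2; echo; done

# check which server1 pods actually served the forwarded /test1 calls
kubectl logs -l app=label-server1 --prefix --tail=20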
Appendix: K8S Calico network issues
Two main network problems came up while testing the API calls.
My K8S cluster was initialized with the following command:
kubeadm init --apiserver-advertise-address=192.168.72.128 --image-repository registry.aliyuncs.com/google_containers
For Calico I applied the official manifest as downloaded. If --pod-network-cidr is specified at kubeadm init time, the CALICO_IPV4POOL_CIDR entry in calico.yaml just needs to match it, as sketched below.
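For reference, this is what matching the two values looks like, using 10.244.0.0/16 purely as an example range:
# kubeadm side: declare the pod CIDR at init time
kubeadm init --apiserver-advertise-address=192.168.72.128 \
  --image-repository registry.aliyuncs.com/google_containers \
  --pod-network-cidr=10.244.0.0/16

# calico.yaml side: the calico-node container env should carry the same range
#   - name: CALICO_IPV4POOL_CIDR
#     value: "10.244.0.0/16"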
The first problem: the cluster had been installed for a while and then left idle, and the coredns component had gone unhealthy / not ready. At that point curl localhost:30309/test2 failed, complaining that the hostname in the request to http://server1-service.default.svc.cluster.local:80/test1 could not be resolved.
[root@128aworker ~]# kubectl get all -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/calico-kube-controllers-577f77cb5c-94cxn 1/1 Running 0 2m21s 10.244.35.1 127amaster <none> <none>
kube-system pod/calico-node-89pdk 0/1 CrashLoopBackOff 4 2m21s 192.168.72.128 128aworker <none> <none>
kube-system pod/calico-node-bt4qx 1/1 Running 0 2m21s 192.168.72.127 127amaster <none> <none>
kube-system pod/coredns-6d56c8448f-kfcx9 1/1 Running 0 6m26s 10.244.55.64 128aworker <none> <none>
kube-system pod/coredns-6d56c8448f-zlbcb 1/1 Running 0 6m26s 10.244.55.65 128aworker <none> <none>
kube-system pod/etcd-128aworker 1/1 Running 0 6m40s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-apiserver-128aworker 1/1 Running 0 6m40s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-controller-manager-128aworker 1/1 Running 0 6m40s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-proxy-dk4mj 1/1 Running 0 6m26s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-proxy-n8cnw 1/1 Running 0 6m14s 192.168.72.127 127amaster <none> <none>
kube-system pod/kube-scheduler-128aworker 1/1 Running 0 6m40s 192.168.72.128 128aworker <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6m43s <none>
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 6m42s k8s-app=kube-dns
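For reference, before resorting to a full rebuild, the DNS path can usually be narrowed down with a few checks like these (a sketch; busybox:1.28 is just a convenient image whose nslookup behaves well):
# are the coredns pods healthy, and do their logs show errors?
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=50

# does the Service name resolve from inside the cluster at all?
kubectl run dns-check --rm -it --restart=Never --image=busybox:1.28 \
  -- nslookup server1-service.default.svc.cluster.local

# a restart is often enough when coredns is merely stuck
kubectl -n kube-system rollout restart deployment coredns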
I ran kubeadm reset -f on both master and worker and rebuilt the cluster. After that coredns was OK, but calico was still not ready:
[root@128aworker ~]# kubectl get all -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/calico-kube-controllers-577f77cb5c-94cxn 1/1 Running 0 8m29s 10.244.35.1 127amaster <none> <none>
kube-system pod/calico-node-89pdk 0/1 Running 6 8m29s 192.168.72.128 128aworker <none> <none>
kube-system pod/calico-node-bt4qx 0/1 Running 0 8m29s 192.168.72.127 127amaster <none> <none>
kube-system pod/coredns-6d56c8448f-kfcx9 1/1 Running 0 12m 10.244.55.64 128aworker <none> <none>
kube-system pod/coredns-6d56c8448f-zlbcb 1/1 Running 0 12m 10.244.55.65 128aworker <none> <none>
kube-system pod/etcd-128aworker 1/1 Running 0 12m 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-apiserver-128aworker 1/1 Running 0 12m 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-controller-manager-128aworker 1/1 Running 0 12m 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-proxy-dk4mj 1/1 Running 0 12m 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-proxy-n8cnw 1/1 Running 0 12m 192.168.72.127 127amaster <none> <none>
kube-system pod/kube-scheduler-128aworker 1/1 Running 0 12m 192.168.72.128 128aworker <none> <none>
The second problem: testing again, requests to http://server1-service.default.svc.cluster.local:80/test1 could not get through at all, and neither kubectl logs nor kubectl describe pod pointed to an obvious fix. The real cause was that a user-defined Docker bridge network, home_elastic, was polluting the K8S network; newer K8S/Calico versions may have fixed this already, and there are quite a few related issues on the Calico tracker...
[root@128aworker ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
b11900aaa7fb bridge bridge local
d2685ac3c5a5 home_elastic bridge local
d00ae1cde2a2 host host local
47c26a5dffe2 none null local
[root@128aworker ~]# docker network rm d26
Delete the user-defined Docker networks on all nodes, kubectl delete -f calico.yaml, then apply it again (the full sequence is sketched below), and everything comes up fine:
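Put together, the cleanup is just this short sequence (run the docker network rm part on every node that has the stray bridge; calico.yaml is the same manifest that was applied originally):
# remove the user-defined bridge network (home_elastic here) on every node
docker network rm home_elastic

# then recreate the Calico resources from the same manifest
kubectl delete -f calico.yaml
kubectl apply -f calico.yaml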
[root@128aworker home]# kubectl get all -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/calico-kube-controllers-577f77cb5c-bgrxd 1/1 Running 0 49s 10.244.35.1 127amaster <none> <none>
kube-system pod/calico-node-fbcxz 1/1 Running 0 49s 192.168.72.128 128aworker <none> <none>
kube-system pod/calico-node-jq2kc 1/1 Running 0 49s 192.168.72.127 127amaster <none> <none>
kube-system pod/coredns-6d56c8448f-qxfhg 1/1 Running 0 2m10s 10.244.55.66 128aworker <none> <none>
kube-system pod/coredns-6d56c8448f-zvrj8 1/1 Running 0 2m10s 10.244.55.65 128aworker <none> <none>
kube-system pod/etcd-128aworker 1/1 Running 0 2m24s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-apiserver-128aworker 1/1 Running 0 2m24s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-controller-manager-128aworker 1/1 Running 0 2m24s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-proxy-2w87z 1/1 Running 0 106s 192.168.72.127 127amaster <none> <none>
kube-system pod/kube-proxy-tdf7k 1/1 Running 0 2m10s 192.168.72.128 128aworker <none> <none>
kube-system pod/kube-scheduler-128aworker 1/1 Running 0 2m24s 192.168.72.128 128aworker <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.1.0.1 <none> 443/TCP 2m27s <none>
kube-system service/kube-dns ClusterIP 10.1.0.10 <none> 53/UDP,53/TCP,9153/TCP 2m25s k8s-app=kube-dns
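With calico-node 1/1 Running on both nodes, the original request path works again; repeating the call from section 5 now returns server1's payload through server2:
# same test as in section 5, through server2's NodePort (30309 in my environment)
curl localhost:30309/test2
# expected response, proxied from server1: {"message": "This is Server1!"}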