Container Probes in the Kubernetes Pod Lifecycle
Container Probes
Kubernetes provides two types of probes to ensure that container instances are healthy and available:
- Liveness Probe: determines whether a container is still "alive". If a container's liveness probe fails, Kubernetes considers the container broken and restarts it. Liveness probes are typically used to detect applications that have crashed or become unresponsive.
- Readiness Probe: determines whether a container is ready to accept traffic. If a container's readiness probe fails, Kubernetes removes the container from the Service's load-balancing pool until the container reports that it is ready again. Readiness probes are commonly used to ensure that a newly started container has finished initializing before it starts receiving traffic.
Both probe types can be implemented as an HTTP request, a TCP connection attempt, or a command executed inside the container. They run periodically, and support an initial delay, a timeout, and a probe interval.
The following parameters can be set on each probe:
- initialDelaySeconds: how many seconds to wait after the container starts before running the first probe.
- timeoutSeconds: how many seconds a probe attempt may run before it times out.
- periodSeconds: how often the probe runs.
- successThreshold: how many consecutive successes are required, after a failure, before the probe is considered successful again.
- failureThreshold: how many consecutive failures are required before the container is considered unhealthy.
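Putting these fields together, a probe definition with all five tuning parameters set might look like the following sketch (the values are illustrative, not recommendations):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 10   # wait 10s after container start before the first probe
  timeoutSeconds: 2         # each probe attempt must finish within 2s
  periodSeconds: 5          # run the probe every 5s
  successThreshold: 1       # one success marks the probe healthy again (must be 1 for liveness)
  failureThreshold: 3       # three consecutive failures trigger a restart
```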
Both liveness probes and readiness probes in Kubernetes support three probe mechanisms. Here is a brief description of each:
- Exec probe: checks container health by running a command inside the container. If the command exits successfully (exit code 0), the container is considered healthy. For example:
  livenessProbe:
    exec:
      command:
      - cat
      - /tmp/healthy
  In this example, if the file /tmp/healthy exists inside the container, the cat command succeeds and the probe reports the container as healthy.
- TCP Socket probe: checks container health by attempting to open a TCP connection to a port inside the container. If the connection is established, the container is considered healthy. For example:
  livenessProbe:
    tcpSocket:
      port: 8080
  In this example, Kubernetes tries to connect to port 8080 of the container; if the connection succeeds, the container is considered healthy.
- HTTP GET probe: checks container health by sending an HTTP GET request to a web application inside the container. If the HTTP response status code is between 200 and 399, the container is considered healthy. For example:
  livenessProbe:
    httpGet:
      path: /healthz   # URI path of the health check
      port: 80         # container port to probe
      host: 127.0.0.1  # host to connect to; defaults to the Pod IP, so it can usually be omitted
      scheme: HTTP     # protocol to use (HTTP or HTTPS)
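All of the examples above configure livenessProbe, but a readiness probe uses exactly the same handler syntax under the readinessProbe key. A minimal sketch (the /ready path is a placeholder for whatever readiness endpoint your application exposes):

```yaml
readinessProbe:
  httpGet:
    path: /ready       # hypothetical readiness endpoint
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
```

Until this probe succeeds, the Pod is kept out of the Service's endpoints rather than restarted.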
Exec Command Probe
[root@K8s-master-01 ~]# vim pod-liveness-exec.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec
  namespace: test
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        command: ["/bin/cat", "/tmp/hello.txt"]
Test:
[root@k8s-master ~]# kubectl apply -f pod-liveness-exec.yaml
pod/pod-liveness-exec created
[root@k8s-master ~]# kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
pod-hook-exec 1/1 Running 0 5m54s
pod-liveness-exec 0/1 ContainerCreating 0 10s
# Watch the Pods: pod-liveness-exec keeps restarting
[root@k8s-master ~]# kubectl get pods -n test -w
NAME READY STATUS RESTARTS AGE
pod-hook-exec 1/1 Running 0 5m58s
pod-liveness-exec 0/1 ContainerCreating 0 14s
pod-liveness-exec 1/1 Running 0 27s
pod-liveness-exec 1/1 Running 1 51s
pod-liveness-exec 1/1 Running 2 80s
^C[root@k8s-master ~]# kubectl get pods pod-liveness-exec -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-liveness-exec 1/1 Running 2 98s 10.244.36.71 k8s-node1 <none> <none>
# The output above shows that the health check started as soon as the nginx container was up
# When the check fails, the container is killed and then restarted (this is the restart policy at work, covered later)
# Check the Pod again after a while: RESTARTS is no longer 0 and keeps growing
[root@k8s-master ~]# kubectl describe pods pod-liveness-exec -n test
Name: pod-liveness-exec
Namespace: test
Priority: 0
Node: k8s-node1/192.168.58.232
Start Time: Sun, 05 Jan 2025 06:25:37 -0500
Labels: <none>
Annotations: cni.projectcalico.org/containerID: 2c63332a13c88208601c7bb06037d9ed48af8a41a6c2719a3bda89d151df4949
cni.projectcalico.org/podIP: 10.244.36.71/32
cni.projectcalico.org/podIPs: 10.244.36.71/32
Status: Running
IP: 10.244.36.71
IPs:
IP: 10.244.36.71
Containers:
nginx:
Container ID: docker://d481982dd1fb13ea1f2fb986c7b3da6c8ca9947de0451db9c1a63a17e91a8e2e
Image: nginx:1.17.1
Image ID: docker-pullable://nginx@sha256:b4b9b3eee194703fc2fa8afa5b7510c77ae70cfba567af1376a573a967c03dbb
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 05 Jan 2025 06:27:27 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 05 Jan 2025 06:26:57 -0500
Finished: Sun, 05 Jan 2025 06:27:27 -0500
Ready: True
Restart Count: 3
Liveness: exec [/bin/cat /tmp/hello.txt] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-p9sl2 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-p9sl2:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 112s default-scheduler Successfully assigned test/pod-liveness-exec to k8s-node1
Normal Pulling 110s kubelet Pulling image "nginx:1.17.1"
Normal Pulled 88s kubelet Successfully pulled image "nginx:1.17.1" in 22.414198218s
Normal Created 2s (x4 over 87s) kubelet Created container nginx
Normal Started 2s (x4 over 86s) kubelet Started container nginx
Warning Unhealthy 2s (x9 over 82s) kubelet Liveness probe failed: /bin/cat: /tmp/hello.txt: No such file or directory
Normal Killing 2s (x3 over 62s) kubelet Container nginx failed liveness probe, will be restarted
Normal Pulled 2s (x3 over 62s) kubelet Container image "nginx:1.17.1" already present on machine
[root@k8s-master ~]# kubectl get pods -n test -w
NAME READY STATUS RESTARTS AGE
pod-hook-exec 1/1 Running 0 7m51s
pod-liveness-exec 1/1 Running 3 2m7s
pod-liveness-exec 1/1 Running 4 2m20s
^C[root@k8s-master ~]# kubectl exec pod-liveness-exec -n test -it -c nginx /bin/sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
# echo hello > /tmp/hello.txt
# ^C
#
# pod-liveness-exec no longer restarts
[root@k8s-master ~]# kubectl get pods -n test -w
NAME READY STATUS RESTARTS AGE
pod-hook-exec 1/1 Running 0 8m35s
pod-liveness-exec 1/1 Running 4 2m51s
TCP Socket Probe
[root@K8s-master-01 ~]# vim pod-liveness-tcpsocket.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket
  namespace: test
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 8080
[root@k8s-master ~]# kubectl apply -f pod-liveness-tcpsocket.yaml
pod/pod-liveness-tcpsocket created
[root@k8s-master ~]# kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
pod-hook-exec 1/1 Running 0 19m
pod-liveness-exec 1/1 Running 4 13m
pod-liveness-tcpsocket 1/1 Running 0 13s
[root@k8s-master ~]# kubectl describe pods pod-liveness-tcpsocket -n test
Name: pod-liveness-tcpsocket
Namespace: test
Priority: 0
Node: k8s-node1/192.168.58.232
Start Time: Sun, 05 Jan 2025 06:39:04 -0500
Labels: <none>
Annotations: cni.projectcalico.org/containerID: 30d75d920042113f989377e26b0e82b69c3bd4d21172569040c5422087f1b57a
cni.projectcalico.org/podIP: 10.244.36.72/32
cni.projectcalico.org/podIPs: 10.244.36.72/32
Status: Running
IP: 10.244.36.72
IPs:
IP: 10.244.36.72
Containers:
nginx:
Container ID: docker://bdfd4e57e301d34d5828c41cc8e984340480cb0916fba774c2e0d33e7eef30ef
Image: nginx:1.17.1
Image ID: docker-pullable://nginx@sha256:b4b9b3eee194703fc2fa8afa5b7510c77ae70cfba567af1376a573a967c03dbb
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 05 Jan 2025 06:39:35 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 05 Jan 2025 06:39:07 -0500
Finished: Sun, 05 Jan 2025 06:39:34 -0500
Ready: True
Restart Count: 1
Liveness: tcp-socket :8080 delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b8f5r (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-b8f5r:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 43s default-scheduler Successfully assigned test/pod-liveness-tcpsocket to k8s-node1
Normal Killing 13s kubelet Container nginx failed liveness probe, will be restarted
Normal Pulled 12s (x2 over 41s) kubelet Container image "nginx:1.17.1" already present on machine
Normal Created 12s (x2 over 41s) kubelet Created container nginx
Normal Started 12s (x2 over 40s) kubelet Started container nginx
Warning Unhealthy 3s (x4 over 33s) kubelet Liveness probe failed: dial tcp 10.244.36.72:8080: connect: connection refused
# Nothing is listening on port 8080 inside the container, so the probe keeps failing; switch it to nginx's port 80
[root@k8s-master ~]# sed -i 's/8080/80/' pod-liveness-tcpsocket.yaml
[root@k8s-master ~]# kubectl delete -f pod-liveness-tcpsocket.yaml
pod "pod-liveness-tcpsocket" deleted
[root@k8s-master ~]# kubectl apply -f pod-liveness-tcpsocket.yaml
pod/pod-liveness-tcpsocket created
[root@k8s-master ~]# kubectl describe pods pod-liveness-tcpsocket -n test
Name: pod-liveness-tcpsocket
Namespace: test
Priority: 0
Node: k8s-node1/192.168.58.232
Start Time: Sun, 05 Jan 2025 06:40:47 -0500
Labels: <none>
Annotations: cni.projectcalico.org/containerID: e7734192c4d09f09204b4178e12466433cbb6b146256b04bd6dae049765c0e70
cni.projectcalico.org/podIP: 10.244.36.73/32
cni.projectcalico.org/podIPs: 10.244.36.73/32
Status: Running
IP: 10.244.36.73
IPs:
IP: 10.244.36.73
Containers:
nginx:
Container ID: docker://883a59fe8538dd3a8043cb230edc5a6f1e5b57ee9b47736fa8a6c61a49bc5a7b
Image: nginx:1.17.1
Image ID: docker-pullable://nginx@sha256:b4b9b3eee194703fc2fa8afa5b7510c77ae70cfba567af1376a573a967c03dbb
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 05 Jan 2025 06:40:49 -0500
Ready: True
Restart Count: 0
Liveness: tcp-socket :80 delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rqkqt (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-rqkqt:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4s default-scheduler Successfully assigned test/pod-liveness-tcpsocket to k8s-node1
Normal Pulled 2s kubelet Container image "nginx:1.17.1" already present on machine
Normal Created 2s kubelet Created container nginx
Normal Started 2s kubelet Started container nginx
HTTP GET Probe
[root@k8s-master ~]# vim pod-liveness-httpget.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: test
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet:
        scheme: HTTP
        port: 80
        path: /hello
[root@k8s-master ~]# kubectl apply -f pod-liveness-httpget.yaml
pod/pod-liveness-httpget created
[root@k8s-master ~]# kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
pod-hook-exec 1/1 Running 0 25m
pod-liveness-exec 1/1 Running 4 20m
pod-liveness-httpget 1/1 Running 0 17s
pod-liveness-tcpsocket 1/1 Running 0 4m54s
[root@k8s-master ~]# kubectl describe pods pod-liveness-httpget -n test
Name: pod-liveness-httpget
Namespace: test
Priority: 0
Node: k8s-node1/192.168.58.232
Start Time: Sun, 05 Jan 2025 06:45:24 -0500
Labels: <none>
Annotations: cni.projectcalico.org/containerID: e7ffbac450075928d7ebc0842a2f688795c09a064241c1a73a229e8d8e59eb24
cni.projectcalico.org/podIP: 10.244.36.74/32
cni.projectcalico.org/podIPs: 10.244.36.74/32
Status: Running
IP: 10.244.36.74
IPs:
IP: 10.244.36.74
Containers:
nginx:
Container ID: docker://c02adbbddcea56e54cf91801bb9caba9f692d760672b2cda96b4abf902beddb6
Image: nginx:1.17.1
Image ID: docker-pullable://nginx@sha256:b4b9b3eee194703fc2fa8afa5b7510c77ae70cfba567af1376a573a967c03dbb
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 05 Jan 2025 06:45:54 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 05 Jan 2025 06:45:26 -0500
Finished: Sun, 05 Jan 2025 06:45:54 -0500
Ready: True
Restart Count: 1
Liveness: http-get http://:80/hello delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-27q8x (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-27q8x:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 41s default-scheduler Successfully assigned test/pod-liveness-httpget to k8s-node1
Normal Pulled 11s (x2 over 40s) kubelet Container image "nginx:1.17.1" already present on machine
Normal Created 11s (x2 over 39s) kubelet Created container nginx
Normal Started 11s (x2 over 39s) kubelet Started container nginx
Normal Killing 11s kubelet Container nginx failed liveness probe, will be restarted
Warning Unhealthy 1s (x4 over 31s) kubelet Liveness probe failed: HTTP probe failed with statuscode: 404
# nginx has no /hello page (the probe gets a 404), so the probe keeps failing; point it at / instead
[root@k8s-master ~]# sed -i 's#/hello#/#' pod-liveness-httpget.yaml
[root@k8s-master ~]# kubectl delete -f pod-liveness-httpget.yaml
pod "pod-liveness-httpget" deleted
[root@k8s-master ~]# kubectl apply -f pod-liveness-httpget.yaml
pod/pod-liveness-httpget created
[root@k8s-master ~]# kubectl describe pods pod-liveness-httpget -n test
Name: pod-liveness-httpget
Namespace: test
Priority: 0
Node: k8s-node1/192.168.58.232
Start Time: Sun, 05 Jan 2025 06:46:52 -0500
Labels: <none>
Annotations: cni.projectcalico.org/containerID: 049cd934b043be61f70f267c24783c6f65b6f0afafedcee6c20df9b060cae24a
cni.projectcalico.org/podIP: 10.244.36.75/32
cni.projectcalico.org/podIPs: 10.244.36.75/32
Status: Running
IP: 10.244.36.75
IPs:
IP: 10.244.36.75
Containers:
nginx:
Container ID: docker://e3f3d2883e90f561939dfaa3133f42ce757a1988667e74cce2358e6028465c71
Image: nginx:1.17.1
Image ID: docker-pullable://nginx@sha256:b4b9b3eee194703fc2fa8afa5b7510c77ae70cfba567af1376a573a967c03dbb
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 05 Jan 2025 06:46:54 -0500
Ready: True
Restart Count: 0
Liveness: http-get http://:80/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8mtjd (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-8mtjd:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5s default-scheduler Successfully assigned test/pod-liveness-httpget to k8s-node1
Normal Pulled 4s kubelet Container image "nginx:1.17.1" already present on machine
Normal Created 4s kubelet Created container nginx
Normal Started 3s kubelet Started container nginx