Zero to JupyterHub with Kubernetes, Part 3: JupyterHub on k8s
Preface: these are personal notes only.
- Setting up Zero to JupyterHub with Kubernetes, Part 1: Kubernetes offline binary deployment.
- Setting up Zero to JupyterHub with Kubernetes, Part 2: Kubernetes routine usage notes.
- Setting up Zero to JupyterHub with Kubernetes, Part 3: JupyterHub on k8s.
Official documentation: Zero to JupyterHub with Kubernetes
**Version compatibility:** This documentation is for Helm chart version 2.0.0 that deploys JupyterHub version 3.0.0 and other components versioned in hub/images/requirements.txt. The Helm chart requires Kubernetes version >=1.20.0 and Helm >=3.5. The versions used here satisfy both requirements:
Component | Version |
---|---|
kubernetes | v1.20.4 |
jupyterhub-chart | 2.0.0 |
helm | v3.12.3 |
Contents
- Part 1: Setup Kubernetes
  - 1. Setup Kubernetes
  - 2. Setting up `helm`
- Part 2: Setup JupyterHub
  - 1. Installing JupyterHub
    - 1.1 Download the required JupyterHub chart version
    - 1.2 Download the offline images
    - 1.3 Load the images
    - 1.4 JupyterHub configuration
      - 1.4.1 Pre-create the PVs and PVC
    - 1.5 Start JupyterHub
    - 1.6 Verify the JupyterHub service
Part 1: Setup Kubernetes
1. Setup Kubernetes
Kubernetes v1.20.4, offline binary deployment (see Part 1 of this series)
[root@k8s-master /data/s0/kubernetes]$ kubectl version --short
Client Version: v1.20.4
Server Version: v1.20.4
2. Setting up helm
File shared via Baidu Netdisk: helm-v3.12.3-linux-amd64.tar.gz
Link: https://pan.baidu.com/s/1f8xONKHWshHxieu7jEN4yA
Extraction code: 1234
# Extract and install
[root@k8s-master /data/s0/kubernetes/helm]$ tar -xzvf helm-v3.12.3-linux-amd64.tar.gz
[root@k8s-master /data/s0/kubernetes/helm]$ ln -s /data/s0/kubernetes/helm/linux-amd64/helm /usr/local/bin
# Verify
[root@k8s-master /data/s0/kubernetes/helm]$ helm version
version.BuildInfo{Version:"v3.12.3", GitCommit:"3a31588ad33fe3b89af5a2a54ee1d25bfe6eaa5e", GitTreeState:"clean", GoVersion:"go1.20.7"}
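The chart requirements quoted at the top (Kubernetes >=1.20.0, Helm >=3.5) can be checked with a small version comparison. This is a minimal sketch: `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (GNU coreutils). The version strings are the ones printed by the commands above.

```shell
#!/bin/sh
# version_ge HAVE WANT: succeed when version HAVE >= version WANT.
# A leading "v" (as in v3.12.3) is ignored.
version_ge() {
  have="${1#v}"
  want="${2#v}"
  # sort -V orders version strings; HAVE >= WANT when WANT sorts first (or both are equal).
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Versions reported by "helm version" and "kubectl version --short" in this article.
version_ge "v3.12.3" "3.5"    && echo "helm OK"
version_ge "v1.20.4" "1.20.0" && echo "kubernetes OK"
```

A plain string comparison would get this wrong, because "3.12" sorts before "3.5" lexically; version sort compares the numeric components.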
Part 2: Setup JupyterHub
1. Installing JupyterHub
1.1 Download the required JupyterHub chart version
JupyterHub's Helm chart repository --> jupyterhub-2.0.0.tgz
File shared via Baidu Netdisk: jupyterhub-2.0.0.tgz
Link: https://pan.baidu.com/s/1ZrEHC9al29ye7n0W3UAi3g
Extraction code: 1234
1.2 Download the offline images
# Extract the chart
[root@k8s-master /data/s0/kubernetes/helm]$ tar -xzvf jupyterhub-2.0.0.tgz # jupyterhub chart
# List the images the chart requires
[root@k8s-master /data/s0/kubernetes/helm]$ cat jupyterhub/Chart.yaml
annotations:
  artifacthub.io/images: |
    - image: jupyterhub/configurable-http-proxy:4.5.3
      name: configurable-http-proxy
    - image: jupyterhub/k8s-hub:2.0.0
      name: k8s-hub
    - image: jupyterhub/k8s-image-awaiter:2.0.0
      name: k8s-image-awaiter
    - image: jupyterhub/k8s-network-tools:2.0.0
      name: k8s-network-tools
    - image: jupyterhub/k8s-secret-sync:2.0.0
      name: k8s-secret-sync
    - image: jupyterhub/k8s-singleuser-sample:2.0.0
      name: k8s-singleuser-sample
    - image: k8s.gcr.io/kube-scheduler:v1.23.10 # this version caused problems when deploying with helm upgrade; v1.20.15 is used instead, so the image name and tag in values.yaml must be changed to match
      name: kube-scheduler
    - image: k8s.gcr.io/pause:3.8 # already downloaded when Kubernetes was deployed; keep the image name in values.yaml consistent with it
      name: pause
    - image: k8s.gcr.io/pause:3.8
      name: pause
    - image: traefik:v2.8.4
      name: traefik
# On a machine with internet access, pull the images and save them locally
# 1. Pull and save jupyterhub/configurable-http-proxy:4.5.3
> docker pull quay.io/jupyterhub/configurable-http-proxy:4.5.3
> docker tag quay.io/jupyterhub/configurable-http-proxy:4.5.3 jupyterhub/configurable-http-proxy:4.5.3
> docker save -o configurable-http-proxy:4.5.3.tar jupyterhub/configurable-http-proxy:4.5.3
# 2. Pull and save jupyterhub/k8s-hub:2.0.0
> docker pull quay.io/jupyterhub/k8s-hub:2.0.0
> docker tag quay.io/jupyterhub/k8s-hub:2.0.0 jupyterhub/k8s-hub:2.0.0
> docker save -o k8s-hub:2.0.0.tar jupyterhub/k8s-hub:2.0.0
# 3. Pull and save jupyterhub/k8s-image-awaiter:2.0.0
> docker pull quay.io/jupyterhub/k8s-image-awaiter:2.0.0
> docker tag quay.io/jupyterhub/k8s-image-awaiter:2.0.0 jupyterhub/k8s-image-awaiter:2.0.0
> docker save -o k8s-image-awaiter:2.0.0.tar jupyterhub/k8s-image-awaiter:2.0.0
# 4. Pull and save jupyterhub/k8s-network-tools:2.0.0
> docker pull quay.io/jupyterhub/k8s-network-tools:2.0.0
> docker tag quay.io/jupyterhub/k8s-network-tools:2.0.0 jupyterhub/k8s-network-tools:2.0.0
> docker save -o k8s-network-tools:2.0.0.tar jupyterhub/k8s-network-tools:2.0.0
# 5. Pull and save jupyterhub/k8s-secret-sync:2.0.0
> docker pull quay.io/jupyterhub/k8s-secret-sync:2.0.0
> docker tag quay.io/jupyterhub/k8s-secret-sync:2.0.0 jupyterhub/k8s-secret-sync:2.0.0
> docker save -o k8s-secret-sync:2.0.0.tar jupyterhub/k8s-secret-sync:2.0.0
# 6. Pull and save jupyterhub/k8s-singleuser-sample:2.0.0
> docker pull m.daocloud.io/docker.io/jupyterhub/k8s-singleuser-sample:2.0.0
> docker tag m.daocloud.io/docker.io/jupyterhub/k8s-singleuser-sample:2.0.0 jupyterhub/k8s-singleuser-sample:2.0.0
> docker save -o k8s-singleuser-sample:2.0.0.tar jupyterhub/k8s-singleuser-sample:2.0.0
# 7. Pull and save k8s.gcr.io/kube-scheduler:v1.20.15
> docker pull k8s-gcr.m.daocloud.io/kube-scheduler:v1.20.15
> docker tag k8s-gcr.m.daocloud.io/kube-scheduler:v1.20.15 k8s.gcr.io/kube-scheduler:v1.20.15
> docker save -o kube-scheduler:v1.20.15.tar k8s.gcr.io/kube-scheduler:v1.20.15
# 8. Pull and save traefik:v2.8.4
> docker pull m.daocloud.io/docker.io/library/traefik:v2.8.4
> docker tag m.daocloud.io/docker.io/library/traefik:v2.8.4 traefik:v2.8.4
> docker save -o traefik:v2.8.4.tar traefik:v2.8.4
# 9. Package the offline images and upload them to the master node
> tar -czvf jupyterhub-chart-images.tgz ./*
> scp jupyterhub-chart-images.tgz k8s-master:/data/s0/kubernetes/helm
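The eight pull/tag/save steps above all follow the same pattern, so they can be driven from one list. The sketch below copies the mirror and target names from those steps. The `RUN` switch and the function names are assumptions of this example, not Docker features. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# One line per image: "<reference on the mirror> <name the chart expects>".
images() {
  cat <<'EOF'
quay.io/jupyterhub/configurable-http-proxy:4.5.3 jupyterhub/configurable-http-proxy:4.5.3
quay.io/jupyterhub/k8s-hub:2.0.0 jupyterhub/k8s-hub:2.0.0
quay.io/jupyterhub/k8s-image-awaiter:2.0.0 jupyterhub/k8s-image-awaiter:2.0.0
quay.io/jupyterhub/k8s-network-tools:2.0.0 jupyterhub/k8s-network-tools:2.0.0
quay.io/jupyterhub/k8s-secret-sync:2.0.0 jupyterhub/k8s-secret-sync:2.0.0
m.daocloud.io/docker.io/jupyterhub/k8s-singleuser-sample:2.0.0 jupyterhub/k8s-singleuser-sample:2.0.0
k8s-gcr.m.daocloud.io/kube-scheduler:v1.20.15 k8s.gcr.io/kube-scheduler:v1.20.15
m.daocloud.io/docker.io/library/traefik:v2.8.4 traefik:v2.8.4
EOF
}

# tar_name IMAGE: archive name as used above, e.g. jupyterhub/k8s-hub:2.0.0 -> k8s-hub:2.0.0.tar
tar_name() { echo "${1##*/}.tar"; }

# Hypothetical dry-run switch: RUN=echo (default) prints commands, RUN="" executes them.
RUN="${RUN-echo}"

save_images() {
  images | while read -r src dst; do
    $RUN docker pull "$src" &&
    $RUN docker tag "$src" "$dst" &&
    $RUN docker save -o "$(tar_name "$dst")" "$dst" || exit 1
  done
}

save_images
```

Running it as `RUN= sh save-images.sh` (the script name is arbitrary) would execute the commands instead of printing them.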
1.3 Load the images
# ------------------ k8s-master, k8s-node1, k8s-node2 ----------------------------
# 1. Load the images (same procedure on node1 and node2)
[root@k8s-master /data/s0/kubernetes/helm]$ mkdir -p ./chart-images && tar -xzvf jupyterhub-chart-images.tgz -C ./chart-images
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i configurable-http-proxy:4.5.3.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i k8s-hub:2.0.0.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i k8s-image-awaiter:2.0.0.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i k8s-network-tools:2.0.0.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i k8s-secret-sync:2.0.0.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i k8s-singleuser-sample:2.0.0.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i kube-scheduler:v1.20.15.tar
[root@k8s-master /data/s0/kubernetes/helm/chart-images]$ docker load -i traefik:v2.8.4.tar
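Since every archive in the directory is loaded the same way, a loop can replace the individual `docker load` calls on each node. This is a minimal sketch: `load_images` and the `RUN` dry-run switch are names invented for this example, and the default only prints the commands.

```shell
#!/bin/sh
# Hypothetical dry-run switch: RUN=echo (default) prints commands, RUN="" executes them.
RUN="${RUN-echo}"

# load_images DIR: run "docker load -i" for every .tar archive found in DIR.
load_images() {
  for f in "$1"/*.tar; do
    [ -e "$f" ] || continue          # the glob matched nothing
    $RUN docker load -i "$f" || return 1
  done
}

load_images ./chart-images
```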
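# 2. Load the custom user environment image (the default single-user server image is k8s-singleuser-sample)
# docker pull m.daocloud.io/docker.io/jupyter/datascience-notebook pulls the latest tag by default; pin a version instead, otherwise each pull fetches whatever is newest
[root@k8s-master /data/s0/kubernetes/helm]$ docker load -i datascience-notebook.tar
# Note: when imagePullPolicy is not set, Kubernetes defaults it to "Always" for an image tagged :latest, so the image is always pulled from the registry;
# for any other tag the default is "IfNotPresent": the local image is used if present, otherwise it is pulled from the registry.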
[root@k8s-master /data/s0/kubernetes/helm]$ docker tag jupyter/datascience-notebook:latest jupyter/datascience-notebook:2023.10.23
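The defaulting rule in the note above can be written down as a small function, which makes it clear why the image is retagged from `latest` to a dated tag in an offline cluster. This sketch only mirrors the documented Kubernetes defaults and does not query a cluster. `default_pull_policy` is a hypothetical helper name.

```shell
#!/bin/sh
# default_pull_policy IMAGE: print the imagePullPolicy Kubernetes assumes
# when the field is omitted from the container spec.
default_pull_policy() {
  case "$1" in
    *@*) echo "IfNotPresent"; return ;;   # digest-pinned reference
  esac
  last="${1##*/}"   # drop the registry/repository path so a registry port is not read as a tag
  case "$last" in
    *:latest) echo "Always" ;;
    *:*)      echo "IfNotPresent" ;;
    *)        echo "Always" ;;            # no tag is treated as :latest
  esac
}

default_pull_policy jupyter/datascience-notebook:latest       # prints Always
default_pull_policy jupyter/datascience-notebook:2023.10.23   # prints IfNotPresent
```

With `Always`, a node without registry access cannot start the pod even though the image is already loaded locally, which is the situation the retagging avoids.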
1.4 JupyterHub configuration
# Custom JupyterHub configuration
[root@datanode40 /data/s0/kubernetes/helm]$ touch config.yaml
[root@datanode40 /data/s0/kubernetes/helm]$ vim config.yaml
Contents of config.yaml:
# Application name (used for the Deployment, Service, Pod, and other resource names)
fullnameOverride: "jupyterhub"
# Registry credentials for image pulls (not needed here: local offline images are used)
imagePullSecret:
  create: false
  automaticReferenceInjection: false
# Hub pod configuration (authentication and authorization)
hub:
  revisionHistoryLimit: 1 # number of old revisions Kubernetes keeps
  config: # contents of the jupyterhub_config.py configuration file
    JupyterHub:
      admin_access: true
      admin_users:
        - zyp # administrator user
      authenticator_class: dummy # dummy authenticator, for testing only
  service:
    type: ClusterIP
    ports:
      nodePort:
  db:
    type: sqlite-pvc # database JupyterHub uses to store users, server state, activity records, and so on
    pvc: # the matching PV must be created beforehand
      accessModes:
        - ReadWriteOnce
      storage: 2Gi
      subPath: sqlite # sub-path inside the PV; defaults to the volume root
      storageClassName: sqlite-pv # storage class
  image:
    name: jupyterhub/k8s-hub
    tag: "2.0.0"
    pullPolicy: IfNotPresent
# chp (configurable-http-proxy) pod: proxy, public proxy service, and HTTPS settings
proxy:
  service:
    type: NodePort # public proxy service
    nodePorts:
      http: 30081
  chp: # configurable-http-proxy (chp) settings
    revisionHistoryLimit: 1
    image:
      name: jupyterhub/configurable-http-proxy
      tag: "4.5.3"
      pullPolicy: IfNotPresent
  https:
    enabled: false # HTTPS disabled
# Single-user Jupyter servers
singleuser:
  networkTools:
    image:
      name: jupyterhub/k8s-network-tools
      tag: "2.0.0"
      pullPolicy: IfNotPresent
  storage: # storage for the single-user environment
    type: static # statically provisioned volume
    static:
      pvcName: notebook-pvc # PVC name; the PVC and PV must be created manually
      subPath: "{username}"
    capacity: 10Gi
    homeMountPath: /home/jovyan # where the home directory volume is mounted in the container
  # Defines the default image
  image:
    name: jupyterhub/k8s-singleuser-sample # can be replaced with your own scientific computing environment
    tag: "2.0.0"
    pullPolicy: IfNotPresent
  profileList: # environments the user can choose from
    - display_name: "sample environment"
      description: "To avoid too much bells and whistles: Python."
      default: true
    - display_name: "Datascience environment"
      description: "If you want the additional bells and whistles: Python, R, and Julia."
      kubespawner_override:
        image: jupyter/datascience-notebook:2023.10.23
        image_pull_policy: IfNotPresent
  startTimeout: 300
  cpu:
    limit:
    guarantee: 0.5
  memory:
    limit:
    guarantee: 1G
  cmd: jupyterhub-singleuser # command that starts the single-user server inside the container
  defaultUrl: "/lab" # user-facing Jupyter interface
  extraEnv:
    JUPYTERHUB_SINGLEUSER_APP: "jupyter_server.serverapp.ServerApp"
# Kubernetes pod scheduling
scheduling:
  userScheduler:
    revisionHistoryLimit: 1
    image:
      name: k8s.gcr.io/kube-scheduler
      tag: "v1.20.15"
      pullPolicy: IfNotPresent
  userPlaceholder:
    image:
      name: k8s.gcr.io/pause
      tag: "3.8"
      pullPolicy: IfNotPresent
# Image pre-puller
prePuller:
  hook:
    enabled: false # offline environment with local images, so nothing to pull
    pullOnlyOnChanges: false
  continuous:
    enabled: false
  pullProfileListImages: false
# Idle culler service (shuts down inactive servers)
cull:
  enabled: true
  users: false # --cull-users
  adminUsers: true # --cull-admin-users
  removeNamedServers: false # --remove-named-servers
  timeout: 3600 # --timeout
  every: 600 # --cull-every
  concurrency: 10 # --concurrency
  maxAge: 0 # --max-age
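In an offline cluster it helps to confirm, before installing, that every image the chart will reference has been loaded on the nodes. One way is to render the chart with `helm template` and extract the image fields. The sketch below assumes the rendered manifests use plain `image:` keys. `list_images` is a hypothetical helper name. Single-user images from `profileList` may not appear in the output, since those pods are started by the hub at runtime rather than declared in a workload manifest.

```shell
#!/bin/sh
# list_images: read rendered manifests on stdin and print each distinct image reference once.
list_images() {
  # Match "image: x" and "- image: x"; keys such as imagePullPolicy do not match.
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' | tr -d "\"'" | sort -u
}

# Typical use (needs helm and the unpacked chart, so it is left commented out here):
#   helm template jupyterhub-release ./jupyterhub --namespace jhub --values config.yaml | list_images
```

Comparing that list against `docker images` on each node shows whether anything is still missing before `helm upgrade` is run.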
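1.4.1 Pre-create the PVs and PVC
For PV persistence, see Part 2 of this series (Kubernetes routine usage notes).
# Define the PV for the sqlite database, and the PV and PVC for single-user server storage
[root@k8s-node1 /data/s0/kubernetes/nfs]$ mkdir pvs
[root@k8s-node1 /data/s0/kubernetes/nfs/pvs]$ vim pvs.yaml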
# PV for the sqlite database
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sqlite-pv1
spec:
  nfs:
    path: /data/s0/kubernetes/nfs/pv1
    readOnly: false
    server: k8s-node1
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: sqlite-pv
  persistentVolumeReclaimPolicy: Retain
---
# PV for the single-user servers
apiVersion: v1
kind: PersistentVolume
metadata:
  name: notebook-pv2
spec:
  nfs:
    path: /data/s0/kubernetes/nfs/pv2
    readOnly: false
    server: k8s-node1
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteMany
  storageClassName: single-notebook
  persistentVolumeReclaimPolicy: Retain
---
# PVC for the single-user servers
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: notebook-pvc # must match pvcName in config.yaml
  namespace: jhub
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: single-notebook
  resources:
    requests:
      storage: 20Gi
1.5 Start JupyterHub
# Create the namespace
[root@k8s-master /data/s0/kubernetes/helm]$ kubectl create ns jhub
# Create the predefined PVs and PVC
[root@k8s-node1 /data/s0/kubernetes/nfs/pvs]$ kubectl apply -f pvs.yaml
# Start JupyterHub
[root@k8s-master /data/s0/kubernetes/helm]$ helm upgrade --cleanup-on-fail \
--install jupyterhub-release ./jupyterhub \
--namespace jhub \
--values config.yaml
# Check pod status (if a pod is stuck in Pending or ContainerCreating, inspect it with: kubectl --namespace=jhub describe pod <name of pod>)
[root@k8s-master /data/s0/kubernetes/helm]$ kubectl --namespace=jhub get pod
jupyterhub-hub-c87985f75-lkl4f 1/1 Running 0 5m18s
jupyterhub-proxy-5d95bb6786-87cqs 1/1 Running 0 5m18s
jupyterhub-user-scheduler-786c6759c7-2r24k 1/1 Running 0 5m18s
jupyterhub-user-scheduler-786c6759c7-6x5k6 1/1 Running 0 5m18s
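# Check how the jupyterhub-proxy-public service is exposed (it is a NodePort service here, so EXTERNAL-IP stays <none> and the hub is reached on node port 30081)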
[root@k8s-master /data/s0/kubernetes/helm]$ kubectl --namespace=jhub get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jupyterhub-hub ClusterIP 10.0.0.50 <none> 8081/TCP 90s
jupyterhub-proxy-api ClusterIP 10.0.0.196 <none> 8001/TCP 90s
jupyterhub-proxy-public NodePort 10.0.0.51 <none> 80:30081/TCP 90s
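When more pods are involved, scanning the listing by eye gets tedious. The filter below prints only the pods that are not yet healthy, so they can be passed to `kubectl describe`. It is a minimal sketch that assumes the default column layout of `kubectl get pod --no-headers` (NAME, READY, STATUS, ...). `not_running` is a hypothetical helper name.

```shell
#!/bin/sh
# not_running: read "kubectl get pod --no-headers" output on stdin and
# print "NAME (STATUS)" for every pod that is neither Running nor Completed.
not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " (" $3 ")" }'
}

# Typical use (needs cluster access, so it is left commented out here):
#   kubectl --namespace=jhub get pod --no-headers | not_running
```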
Problem: Error: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "jupyterhub-user-scheduler" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "jupyterhub-release": current value is "jupyterhub-v1"
Solution: the cluster-scoped resources named jupyterhub-user-scheduler still belong to an earlier release (jupyterhub-v1). Delete them, then rerun the helm upgrade command:
kubectl delete clusterrole jupyterhub-user-scheduler
kubectl delete clusterrolebinding jupyterhub-user-scheduler
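1.6 Verify the JupyterHub service
From a remote host, open http://k8s-master:30081
- User login page
- Environment selection page
- User workspace
- Underlying single-user container
- Persistent storage view
- Kubernetes dashboard view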
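Before opening a browser, the NodePort address (80:30081 in the service listing) can be probed from the command line. The helper below builds the login URL and rejects ports outside the default Kubernetes NodePort range of 30000-32767, which catches mistyped port numbers early. `hub_url` is a hypothetical helper name. The check assumes the cluster has not changed the default range.

```shell
#!/bin/sh
# hub_url HOST PORT: print the JupyterHub login URL for a NodePort service.
hub_url() {
  case "$2" in
    ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
  esac
  if [ "$2" -lt 30000 ] || [ "$2" -gt 32767 ]; then
    echo "port $2 is outside the default NodePort range 30000-32767" >&2
    return 1
  fi
  echo "http://$1:$2/hub/login"
}

# Typical use (needs network access to the cluster, so it is left commented out here):
#   curl -sS -o /dev/null -w '%{http_code}\n' "$(hub_url k8s-master 30081)"
```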