A minikube-based Kubernetes getting-started tutorial --Ollama
Let's deploy Ollama (llama3.2) in a local Kubernetes cluster.
There are several ways to install Ollama, and this step is optional; since we already have a minikube environment, I deploy it in Kubernetes with a Deployment. With the Ollama image running in the cluster, we can use the Ollama CLI to run models such as llama or deepseek, and doing this inside Kubernetes is quite convenient.
Create the namespace (ollama)
$ kubectl create namespace ollama
namespace/ollama created
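Equivalently, the namespace can be declared in a manifest and created with kubectl apply; a minimal sketch (the file name is arbitrary):
apiVersion: v1
kind: Namespace
metadata:
  name: ollama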
Next, write the YAML manifests for the ollama Deployment and Service and apply them (a minimal sketch is shown below), which gives us a Pod and a Service. Because of network conditions, pulling the image took close to 35 minutes before the Pod switched to the Running state.
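A minimal sketch of the two manifests, assuming Ollama's default API port 11434 inside the container and matching the Service port and NodePort seen in the output below (80 and 31772); resource names and labels are illustrative:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434    # Ollama's default listen port
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  type: NodePort
  selector:
    app: ollama
  ports:
  - port: 80             # matches 80:31772/TCP in the output below
    targetPort: 11434    # forwards to the Ollama container
    nodePort: 31772
Apply both with kubectl apply -f and then check the resources: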
$ kubectl get pods,svc -n ollama
NAME                          READY   STATUS              RESTARTS   AGE
pod/ollama-55bdc97d8f-z4lw5   0/1     ContainerCreating   0          24m

NAME             TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/ollama   NodePort   10.104.206.83   <none>        80:31772/TCP   23m
$ kubectl describe pod/ollama-55bdc97d8f-z4lw5 -n ollama
....
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  26m   default-scheduler  Successfully assigned ollama/ollama-55bdc97d8f-z4lw5 to minikube
  Normal  Pulling    26m   kubelet            Pulling image "ollama/ollama:latest"
$ kubectl get pods,svc -n ollama
NAME                          READY   STATUS    RESTARTS   AGE
pod/ollama-55bdc97d8f-z4lw5   1/1     Running   0          35m

NAME             TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/ollama   NodePort   10.104.206.83   <none>        80:31772/TCP   34m
$ minikube service ollama -n ollama --url
http://192.168.49.2:31772
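Before pulling a model, we can sanity-check that the Service reaches the Ollama API; hitting the root path of the NodePort URL should return Ollama's health banner:
$ curl http://192.168.49.2:31772/
Ollama is running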
Next, run llama3.2 inside the ollama Pod.
$ kubectl exec -it pod/ollama-55bdc97d8f-z4lw5 -n ollama -- sh
# ollama -v
ollama version is 0.5.7-0-ga420a45-dirty
++++llama+++++
# ollama run llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 96 B
pulling 34bb5ab01051... 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 561 B
verifying sha256 digest
writing manifest
success
>>> hi
How can I assist you today?
>>>
Generate request (Streaming)
At the end of the run we get an interactive chat session; we can also use curl to verify that the API is working properly.
$ curl http://192.168.49.2:31772/api/generate -d '{
"model": "llama3.2",
"prompt": "Why is the sky blue?"
}'
{"model":"llama3.2","created_at":"2025-02-04T13:44:53.135357039Z","response":"The","done":false}
{"model":"llama3.2","created_at":"2025-02-04T13:44:53.343546503Z","response":" sky","done":false}
{"model":"llama3.2","created_at":"2025-02-04T13:44:53.527870267Z","response":" appears","done":false}
{"model":"llama3.2","created_at":"2025-02-04T13:44:53.711866139Z","response":" blue","done":false}
{"model":"llama3.2","created_at":"2025-02-04T13:44:53.883596915Z","response":" because","done":false}
{"model":"llama3.2","created_at":"2025-02-04T13:44:54.063993698Z","response":" of","done":false}
{"model":"llama3.2","created_at":"2025-02-04T13:44:54.236673872Z","response":" a","done":false}
Back in the Dify UI, go to Settings >> Model Providers.
Back on the home page, Create Blank App >> Chatbot.
Click into the newly created Chatbot app, then click Publish >> Open in "Explore".
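When registering the model provider, Dify needs a base URL it can reach; assuming Dify can reach the minikube node address, the NodePort URL from above (http://192.168.49.2:31772) can serve as that address. As a quick sanity check that chat-style requests work, Ollama also exposes /api/chat (the message content here is just an example):
$ curl http://192.168.49.2:31772/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "user", "content": "hi"}
  ]
}'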
Given the laptop's hardware and the model's performance, responses during the conversation were somewhat slow.