
Deploying a DeepSeek-Based Retrieval-Augmented Knowledge Base on a Server

AI Server Info

CPU: 2× (64 cores / 128 threads, 2.9 GHz base, 3.5 GHz boost)
Motherboard: Supermicro X12 workstation board
RAM: Samsung RECC DDR4 32 GB 3200 server ECC memory × 4
Disk: Kingston 1 TB NVMe PCIe 4.0 SSD
GPU: NVIDIA GeForce RTX 4090 24 GB × 2

1. Server info

# check the architecture (expect x86_64)
uname -m
# list the GPU
lspci | grep VGA
# output shows an NVIDIA GeForce RTX 4090, not an AMD GPU:
# 31:00.0 VGA compatible controller: NVIDIA Corporation AD102 [GeForce RTX 4090] (rev a1)
# show the OS release, codename, and more (here: Codename: noble)
cat /etc/os-release
lsb_release -a
hostnamectl

2. anaconda

wget https://repo.anaconda.com/archive/Anaconda3-2024.10-1-Linux-x86_64.sh
bash Anaconda3-2024.10-1-Linux-x86_64.sh
source ~/anaconda3/bin/activate
conda --version
conda update conda

3. ollama

See the official Ollama install docs.

# remove any previous install first
# sudo rm -rf /usr/lib/ollama
# automatic install
curl -fsSL https://ollama.com/install.sh | sh

# or install manually
# with an NVIDIA GeForce RTX 4090 there is no need to install ROCm (AMD only)
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
# if downloaded on another machine, copy the tarball over
scp ~/Downloads/ollama-linux-amd64.tgz lwroot0@192.168.0.20:~/instal
# extract to /usr (the binaries land under /usr/lib/ollama)
sudo tar -C /usr -xzf ollama-linux-amd64.tgz

# start
ollama serve
# check the installed version
ollama -v
Adding Ollama as a startup service

Create a user and group for Ollama:

sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)

Create a service file in /etc/systemd/system/ollama.service:

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"
Environment="OLLAMA_MODELS=/usr/share/ollama/.ollama/models/"
Environment="OLLAMA_HOST=0.0.0.0"

[Install]
WantedBy=default.target

Then start the service:

sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama

Add to the user environment:

vi ~/.bashrc
# add
# export OLLAMA_MODELS=/usr/share/ollama/.ollama/models/
# export OLLAMA_HOST=0.0.0.0

source ~/.bashrc
echo $OLLAMA_MODELS
Run an AI model

You should have at least 8 GB of RAM to run the 7B models, 16 GB for the 13B models, and 32 GB for the 33B models.
See the Ollama model library for the available models.
Models are saved in ~/.ollama/models/ or in the directory set by OLLAMA_MODELS.
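The RAM guideline above can be codified as a small helper (a sketch; the thresholds are simply the figures quoted above, and `min_ram_gb` is a hypothetical name):

```python
def min_ram_gb(params_billion: float) -> int:
    """Rough minimum system RAM (GB) for an Ollama model, per the
    guideline above: 7B -> 8 GB, 13B -> 16 GB, 33B -> 32 GB."""
    if params_billion <= 7:
        return 8
    if params_billion <= 13:
        return 16
    if params_billion <= 33:
        return 32
    raise ValueError("beyond the quoted guideline; size the machine case by case")

print(min_ram_gb(14))  # deepseek-r1:14b falls in the 13B..33B bracket -> 32
```

Note this is system RAM only; GPU VRAM (24 GB per RTX 4090 here) is what determines how much of the model runs on the GPU.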

Model         Size      Command
deepseek-r1   14b       ollama run deepseek-r1:14b
deepseek-r1   32b       ollama run deepseek-r1:32b
deepseek-v2   16b       ollama run deepseek-v2
qwen2.5       14b       ollama run qwen2.5:14b
phi4          14b only  ollama run phi4
glm4          9b only   ollama run glm4
llama3.1      8b        ollama run llama3.1
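Once a model is pulled, it can also be queried over Ollama's HTTP API (default port 11434) instead of the interactive CLI. A minimal standard-library sketch; the host IP is the one used elsewhere in this guide:

```python
import json
import urllib.request

OLLAMA_URL = "http://192.168.0.20:11434"  # the server IP used in this guide

def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Uncomment on a machine that can reach the server:
# req = build_generate_request("deepseek-r1:14b", "Hello")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```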

4. docker

See the official Docker install docs.

# update
sudo apt update
sudo apt upgrade

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

# alternatively, use the Aliyun mirror
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo tee /etc/apt/keyrings/docker.asc
sudo sh -c 'echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list'
sudo apt-get update

# install latest version
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# add mirror
sudo vi /etc/docker/daemon.json
{
  "registry-mirrors": [
    "https://docker.registry.cyou",
    "https://docker-cf.registry.cyou",
    "https://dockercf.jsdelivr.fyi",
    "https://docker.jsdelivr.fyi",
    "https://dockertest.jsdelivr.fyi",
    "https://mirror.aliyuncs.com",
    "https://dockerproxy.com",
    "https://mirror.baidubce.com",
    "https://docker.m.daocloud.io",
    "https://docker.nju.edu.cn",
    "https://docker.mirrors.sjtug.sjtu.edu.cn",
    "https://docker.mirrors.ustc.edu.cn",
    "https://mirror.iscas.ac.cn",
    "https://docker.rainbond.cc"
  ]
}
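Typographic (curly) quotes are an easy way to break this file, and Docker will refuse to start on invalid JSON. A quick validation sketch (hypothetical helper name) to run before restarting the daemon:

```python
import json

def registry_mirrors(daemon_json_text: str) -> list:
    """Parse daemon.json text and return the registry-mirrors list.
    Raises json.JSONDecodeError if the file is not valid JSON
    (e.g. if curly quotes slipped in from a copy-paste)."""
    cfg = json.loads(daemon_json_text)
    return cfg.get("registry-mirrors", [])

# with open("/etc/docker/daemon.json") as f:
#     print(registry_mirrors(f.read()))
```

Once the file validates, apply it with `sudo systemctl restart docker` and confirm the mirrors show up under `docker info`.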
5. MaxKB

Model overview

docker run -d --name=maxkb --restart=always -p 7861:8080 -v ~/.maxkb:/var/lib/postgresql/data -v ~/.python-packages:/opt/maxkb/app/sandbox/python-packages 1panel/maxkb

# test connectivity to Ollama from inside the container
sudo docker exec -it maxkb bash
curl http://192.168.0.20:11434/
# expected output: Ollama is running

Visit: http://your_ip:7861
Default account (the system forces a password change on first login):
username: admin
password: MaxKB@123…
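Before logging in, you can confirm MaxKB is up by probing the published port (a sketch: it only checks that an HTTP server answers on 7861, nothing MaxKB-specific; `is_up` is a hypothetical name):

```python
import urllib.request
import urllib.error

def is_up(host: str, port: int = 7861, timeout: float = 3.0) -> bool:
    """Return True if any HTTP server answers on host:port."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # got an HTTP response (redirect/4xx/5xx): the server is up
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, ...

# print(is_up("192.168.0.20"))
```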

