
Deploying Embedding and Rerank Models on Ascend Servers (Atlas800 Series)

article 2025/2/22 2:11:43

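1. Confirm the installation environment

Environment	Model	CANN version
Training environment	Atlas800T A2 server	CANN 8.0.RC2 or later
Inference environment	Atlas800I A2 server	CANN 8.0.RC2 or later
Inference environment	Atlas300IDUO inference card	CANN 8.0.RC2 or later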

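2. Get the download package

The resource package can be downloaded with wget (note that --no-check-certificate disables TLS certificate verification):

wget https://tools.obs.cn-south-292.ca-aicc.com:443/samples/llm/embed_rerank.tar.gz --no-check-certificate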

3. Configure the base environment

apt update && apt install curl build-essential autoconf libtool make g++ unzip wget libssl-dev pkg-config -y

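4. Create a conda environment

The new environment is cloned from the existing MindIE_1.0.RC2 environment, so that environment must already be present.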

conda create -n Embedding --clone MindIE_1.0.RC2
conda activate Embedding

5. Install Rust and protoc

Install Rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

When the following prompt appears, choose 1.

1) Proceed with standard installation (default - just press enter)
2) Customize installation
3) Cancel installation

Install protobuf v21.12
The resource package includes the protobuf source code. Unpack it, then configure and build it:

tar -zxvf protobuf-all-21.12.tar.gz
cd protobuf-21.12
./configure
make -j20
make install

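Then run the following command in the shell so that the newly installed libraries in /usr/local/lib can be found:
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH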
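
As a quick sanity check after the build, the sketch below confirms that protoc is reachable and reports its version. The helper names prepend_lib_dir and protoc_version are hypothetical, introduced only for this illustration; the sketch assumes protoc was installed under the default /usr/local prefix, as the ./configure call above would do.

```python
import os
import shutil
import subprocess
from typing import Optional


def prepend_lib_dir(current: str, lib_dir: str = "/usr/local/lib") -> str:
    """Return an LD_LIBRARY_PATH value with lib_dir first, without duplicate or empty entries."""
    parts = [p for p in current.split(":") if p and p != lib_dir]
    return ":".join([lib_dir] + parts)


def protoc_version() -> Optional[str]:
    """Return the output of `protoc --version`, or None if protoc is missing or fails."""
    exe = shutil.which("protoc")
    if exe is None:
        return None
    env = dict(os.environ)
    # Make sure the freshly built shared libraries can be located.
    env["LD_LIBRARY_PATH"] = prepend_lib_dir(env.get("LD_LIBRARY_PATH", ""))
    result = subprocess.run([exe, "--version"], capture_output=True, text=True, env=env)
    return result.stdout.strip() or None


# Usage: print(protoc_version())
```

If the function returns None, protoc is either not on PATH or could not load its shared libraries, which usually points back to the LD_LIBRARY_PATH setting.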

6. Install application dependencies

In the main directory:

pip install -r requirements.txt

Install the router

Go to $work_dir/TEI/text-embeddings-inference and run the following command:

../../cargo/bin/cargo install --path router -F python -F http --no-default-features

[Screenshot: output after a successful installation]

Go to $work_dir/TEI/text-embeddings-inference/backends/python/server
- Install the dependencies, then build and install:

make install 
pip install transformers==4.37.0
pip install safetensors==0.3.3
poetry install
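
Because two packages are pinned to exact versions above, it can be worth confirming that those versions are still in place once all install steps have finished. The following is a minimal sketch using only the standard library; the helper names installed_versions and find_mismatches are hypothetical, and the pins are copied from the pip commands in this step.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Pins taken from the pip install commands in this step.
PINS = {"transformers": "4.37.0", "safetensors": "0.3.3"}


def installed_versions(names) -> Dict[str, Optional[str]]:
    """Look up the installed version of each package, or None if it is not installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pins: Dict[str, str],
                    installed: Dict[str, Optional[str]]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, actual)} for every package whose version differs from its pin."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pins.items()
        if installed.get(name) != expected
    }


# Usage: print(find_mismatches(PINS, installed_versions(PINS)))
```

An empty dictionary means both packages match their pins in the active conda environment.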

[Screenshot: output after installation]

7. Run and test the models

embedding
- Return to the main directory and run the following script:

bash start_im_embed.sh
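
Loading the model can take a while, so a test request sent immediately after starting the script may be refused. The sketch below polls a TCP port until it accepts connections; wait_for_port is a hypothetical helper, and the port numbers are taken from the test requests in this section (11027 for embedding, 11028 for rerank). An open port only shows that the server is listening, not that the model is fully loaded.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 120.0, interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Usage: wait_for_port("127.0.0.1", 11027)
```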

Test:

curl 127.0.0.1:11027/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
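
The same request can be sent from Python, which makes it easier to work with the returned vectors. This is a sketch under assumptions: the helper names are hypothetical, and the response is assumed to be a JSON list containing one embedding vector per input, as in upstream text-embeddings-inference; the source does not show the response body, so check it against your deployment.

```python
import json
import math
import urllib.request


def build_embed_request(text: str, base_url: str = "http://127.0.0.1:11027") -> urllib.request.Request:
    """Build the same POST /embed request as the curl command above."""
    body = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, base_url: str = "http://127.0.0.1:11027"):
    """Send the request and decode the JSON response (assumed: a list of vectors)."""
    with urllib.request.urlopen(build_embed_request(text, base_url), timeout=30) as resp:
        return json.load(resp)


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Usage (requires the embedding service to be running):
#   v1 = embed("What is Deep Learning?")[0]
#   v2 = embed("What is machine learning?")[0]
#   print(cosine_similarity(v1, v2))
```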

[Screenshot: embedding inference result]

rerank
- Return to the main directory and run the following script:

bash start_im_rerank.sh

Test:

curl 127.0.0.1:11028/rerank \
    -X POST \
    -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json'
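
A rerank response is most useful once the scores are mapped back to the candidate texts. The sketch below builds the same request in Python and orders the texts by score. The helper names are hypothetical, and the response is assumed to be a JSON list of objects with "index" and "score" fields, as in upstream text-embeddings-inference; the source does not show the response body, so verify it against your deployment.

```python
import json
import urllib.request


def build_rerank_request(query: str, texts, base_url: str = "http://127.0.0.1:11028") -> urllib.request.Request:
    """Build the same POST /rerank request as the curl command above."""
    body = json.dumps({"query": query, "texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/rerank",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rerank(query: str, texts, base_url: str = "http://127.0.0.1:11028"):
    """Send the request and decode the JSON response (assumed: [{"index": i, "score": s}, ...])."""
    with urllib.request.urlopen(build_rerank_request(query, texts, base_url), timeout=30) as resp:
        return json.load(resp)


def rank_texts(texts, results):
    """Pair each score with its text and sort from most to least relevant."""
    ordered = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(texts[r["index"]], r["score"]) for r in ordered]


# Usage (requires the rerank service to be running):
#   texts = ["Deep Learning is not...", "Deep learning is..."]
#   print(rank_texts(texts, rerank("What is Deep Learning?", texts)))
```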

[Screenshot: rerank inference result]