llama.cpp demo
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
Modify the Makefile to enable the -mfma and related SIMD flags:
MK_CFLAGS += -mfma -mf16c -mavx
MK_CXXFLAGS += -mfma -mf16c -mavx
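With the flags in place, build the binaries. A minimal sketch assuming this checkout still ships the classic Makefile build (newer llama.cpp releases have switched to CMake):
# builds llama-cli and the other tools used below
make -j$(nproc)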
Install the Python 3 dependencies:
cat ./requirements/requirements-convert_legacy_llama.txt
numpy~=1.26.4
sentencepiece~=0.2.0
transformers>=4.40.1,<5.0.0
gguf>=0.1.0
protobuf>=4.21.0,<5.0.0
Install them one by one (pip3 install numpy, pip3 install sentencepiece, pip3 install transformers, pip3 install gguf, pip3 install protobuf), or install them all at once as shown below.
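A single command installs everything listed in the requirements file shipped with llama.cpp:
pip3 install -r ./requirements/requirements-convert_legacy_llama.txt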
Download https://huggingface.co/4bit/Llama-2-7b-chat-hf
Convert it to llama-2-7b-chat.gguf:
python3 convert_hf_to_gguf.py ./models/Llama-2-7b-chat-hf --outfile llama-2-7b-chat.gguf
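Optionally, the converted model can be quantized to a smaller file for faster inference. A sketch assuming the llama-quantize tool built alongside llama-cli (the Q4_K_M type and output name are only illustrative):
./llama-quantize ./llama-2-7b-chat.gguf ./llama-2-7b-chat-Q4_K_M.gguf Q4_K_M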
Launch:
./llama-cli -m ./llama-2-7b-chat.gguf -co -cnv -p "You are a helpful assistant." -fa -ngl 80 -n 512
You can then start a conversation:
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to the AI.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
<|im_start|>system
You are a helpful assistant.<|im_end|>
> hello
Hello! How can I help you today?
<|im_end|>
>
tell me about yourself
<|im_end|>
> can you speek Chinese ?
I apologize, but I don't speak Chinese. I'm just an AI trained to assist and provide helpful responses in English. Is there anything else I can help you with?
<|im_end|>
>
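Besides the interactive CLI, the same GGUF file can also be served over HTTP. A minimal sketch using the llama-server binary built with llama.cpp (the port and the OpenAI-style endpoint below assume llama-server defaults):
./llama-server -m ./llama-2-7b-chat.gguf -ngl 80 --port 8080
# query the OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hello"}]}'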
Alternatively, check whether a pre-converted llama-2-7b-chat.gguf can be downloaded directly from:
https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/tree/main
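If a pre-converted file is available there, one way to fetch a single quantized GGUF is the huggingface-cli tool (the file name below is an example; check the repo listing for the exact names):
pip3 install -U "huggingface_hub[cli]"
huggingface-cli download TheBloke/Llama-2-7B-Chat-GGUF llama-2-7b-chat.Q4_K_M.gguf --local-dir ./models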