GraphRAG Knowledge Graphs: A Hands-On Guide to Configuring the Alibaba Cloud DashScope (Bailian) Platform, Part 1
1. First, create a new GraphRAG project with uv
uv init graphrag
2. Change into the project directory
cd graphrag
3. Create and activate the project's virtual environment (uv init does not create .venv by itself)
uv venv
source .venv/bin/activate
4. Install the graphrag package
uv pip install graphrag
5. Create a directory to hold the GraphRAG input, and download Microsoft's sample dataset into it
mkdir -p ./openl/input
curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt -o ./openl/input/book.txt
6. Initialize the project
graphrag init --root ./openl
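graphrag init writes a .env file next to settings.yaml containing a single GRAPHRAG_API_KEY entry; replace its placeholder value with your own DashScope API key (the value below is a placeholder, not a real key):

GRAPHRAG_API_KEY=<your-dashscope-api-key>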
7. Edit settings.yaml and the .env file under ./openl (only one API key is needed; change it to your own). The modified or added lines are called out in the comments below.
models:
  default_chat_model:
    type: openai_chat # or azure_openai_chat
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    # api_version: 2024-05-01-preview
    auth_type: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    encoding_model: cl100k_base # key setting: explicit encoding for DashScope (Bailian) models
    model: qwen-plus # changed to a Qwen model from the DashScope catalog
    # deployment_name: <azure_model_deployment_name>
    # encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: -1 # set to -1 for dynamic retry logic (most optimal setting based on server response)
    tokens_per_minute: 0 # set to 0 to disable rate limiting
    requests_per_minute: 0 # set to 0 to disable rate limiting
Changes to the embedding model configuration:
  default_embedding_model:
    type: openai_embedding # or azure_openai_embedding
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    # api_version: 2024-05-01-preview
    auth_type: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY}
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    encoding_model: cl100k_base # explicitly specify the encoder
    model: text-embedding-v3 # Alibaba Cloud's official embedding model name
    # deployment_name: <azure_model_deployment_name>
    # encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: -1 # set to -1 for dynamic retry logic (most optimal setting based on server response)
    tokens_per_minute: 0 # set to 0 to disable rate limiting
    requests_per_minute: 0 # set to 0 to disable rate limiting
With only these changes in place, run the indexing step and the pipeline will start:
graphrag index --root ./openl
The run, however, ends in failure.