GraphRAG Knowledge Graphs: A Hands-On Guide to Configuring the Alibaba Cloud DashScope (Bailian) Platform, Part 1
1. First, create a new GraphRAG project with uv
uv init graphrag
2. Change into the project directory
cd graphrag
3. Create and activate the project's virtual environment (uv init does not create .venv by itself)
uv venv
source .venv/bin/activate
4. Install the graphrag package
uv pip install graphrag
5. Create a directory to hold the GraphRAG input, and download Microsoft's sample dataset into it
mkdir -p ./openl/input
curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt -o ./openl/input/book.txt
6. Initialize the project
graphrag init --root ./openl
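graphrag init writes a .env file next to settings.yaml containing a single GRAPHRAG_API_KEY entry; replace its placeholder value with your own DashScope API key (the value below is a placeholder, not a real key):

GRAPHRAG_API_KEY=<your-dashscope-api-key>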
7. Edit settings.yaml and the .env file under ./openl (only one API key is needed; change it to your own). The modified or added lines are called out in the comments below.
models:
  default_chat_model:
    type: openai_chat # or azure_openai_chat
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    # api_version: 2024-05-01-preview
    auth_type: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    encoding_model: cl100k_base # key setting: explicit encoding for DashScope (Bailian) models
    model: qwen-plus # changed to a Qwen model from the DashScope catalog
    # deployment_name: <azure_model_deployment_name>
    # encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: -1 # set to -1 for dynamic retry logic (most optimal setting based on server response)
    tokens_per_minute: 0 # set to 0 to disable rate limiting
    requests_per_minute: 0 # set to 0 to disable rate limiting
Changes to the embedding model configuration:
  default_embedding_model:
    type: openai_embedding # or azure_openai_embedding
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    # api_version: 2024-05-01-preview
    auth_type: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY}
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    encoding_model: cl100k_base # explicitly specify the encoder
    model: text-embedding-v3 # Alibaba Cloud's official embedding model name
    # deployment_name: <azure_model_deployment_name>
    # encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: -1 # set to -1 for dynamic retry logic (most optimal setting based on server response)
    tokens_per_minute: 0 # set to 0 to disable rate limiting
    requests_per_minute: 0 # set to 0 to disable rate limiting
With only these changes in place, run the indexing step and the pipeline will start:
graphrag index --root ./openl
The run, however, ends in failure.