
Elasticsearch open inference API adds support for Alibaba Cloud AI Search

article 2025/2/21 3:52:05

By Dave Kyle (Elastic) and Weizijun (Alibaba Cloud)

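We are pleased to announce that the Elasticsearch open inference API now integrates with Alibaba Cloud AI Search. This work lets Elastic users connect directly to the Alibaba Cloud AI platform. Developers building RAG applications on the Elasticsearch vector database can use semantic_text to store and use dense and sparse embeddings generated by models hosted on the Alibaba Cloud AI Search platform. In addition, Elastic users now have integrated access to reranking models for enhanced semantic reranking, and to the Qwen family of LLMs.

In this blog, we explore how to integrate Alibaba Cloud's AI services with Elasticsearch. You will learn how to set up and use Alibaba's completion, rerank, sparse embedding, and text embedding services in Elasticsearch. The broad range of supported models behind these inference task types will improve relevance for many use cases, including RAG.

We thank the Alibaba team for contributing support for these task types to the Elasticsearch open inference API!

Let's walk through examples of how to configure and use these services in an Elasticsearch environment. Note that Alibaba uses the term service_id rather than model_id.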

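Using foundation models in the Alibaba Cloud AI Search platform

This walkthrough assumes you already have an Alibaba Cloud account with access to the Alibaba Cloud AI Search platform. Next, you need to create a workspace and an API key for creating inference endpoints.

Creating an inference API endpoint in Elasticsearch

In Elasticsearch, create an endpoint for accessing the Alibaba Cloud AI Search platform by specifying the service "alibabacloud-ai-search" together with the service settings: your workspace, host, service ID, and API key. In our example, we create a text embedding endpoint using "ops-text-embedding-001" as the service ID.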

PUT _inference/text_embedding/ali_ai_embeddings
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "<api_key>",
        "service_id": "ops-text-embedding-001",
        "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}

You will receive a response from Elasticsearch indicating that the endpoint was created successfully:

{
  "inference_id": "ali_ai_embeddings",
  "task_type": "text_embedding",
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "similarity": "dot_product",
    "dimensions": 1536,
    "service_id": "ops-text-embedding-001",
    "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default",
    "rate_limit": {
      "requests_per_minute": 10000
    }
  },
  "task_settings": {}
}

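Note that no additional setup is needed to create the model. Elasticsearch automatically connects to the Alibaba Cloud AI Search platform to test your credentials and service ID, and fills in the number of dimensions and the similarity measure for you.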
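The endpoint-creation request can also be assembled programmatically. The following is a minimal Python sketch rather than an official client: the helper name `build_endpoint_request` is hypothetical, and it only builds the method, path, and JSON body used in the console example without sending anything to Elasticsearch.

```python
import json


def build_endpoint_request(task_type, inference_id, service_id, host, api_key,
                           workspace="default"):
    """Return (method, path, body) for creating an inference endpoint.

    Hypothetical helper: it mirrors the console request from this article
    and does not contact Elasticsearch.
    """
    path = f"_inference/{task_type}/{inference_id}"
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": service_id,
            "host": host,
            "workspace": workspace,
        },
    }
    return "PUT", path, body


# Same placeholder values as the console example.
method, path, body = build_endpoint_request(
    task_type="text_embedding",
    inference_id="ali_ai_embeddings",
    service_id="ops-text-embedding-001",
    host="xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    api_key="<api_key>",
)
print(method, path)
print(json.dumps(body, indent=4))
```

Only the task type, endpoint name, and service ID change between the four task types covered in this article, so one helper of this shape could cover all of them.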

Next, let's test the endpoint to make sure everything is set up correctly. To do this, we call the perform inference API:

POST _inference/text_embedding/ali_ai_embeddings
{
  "input": "What is Elastic?"
}

The API call returns the embedding generated for the provided input, which looks like this:

{
    "text_embedding": [
        {
            "embedding": [
                0.048400473,
                0.051464397,
                … (additional values) …
                0.033325635,
                -0.008986305
            ]
        }
    ]
}
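Since the endpoint reports dot_product as its similarity measure, two embeddings returned in this response shape can be compared with a plain dot product. The sketch below is illustrative only: the helper names are hypothetical, and the three-dimensional vectors are made-up stand-ins for the real 1536-dimensional output.

```python
def extract_embeddings(response):
    """Pull the raw vectors out of a text_embedding inference response."""
    return [item["embedding"] for item in response["text_embedding"]]


def dot_product(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


# Toy responses with made-up values; real vectors have 1536 dimensions.
query_response = {"text_embedding": [{"embedding": [1.0, 0.0, 0.5]}]}
doc_response = {"text_embedding": [{"embedding": [0.5, 0.25, 1.0]}]}

query_vector = extract_embeddings(query_response)[0]
doc_vector = extract_embeddings(doc_response)[0]
score = dot_product(query_vector, doc_vector)
print(score)
```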

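You are now ready to explore. After trying these examples, check out some exciting innovations in Elasticsearch for semantic search use cases:

The new semantic_text field simplifies storing and chunking embeddings: just pick your model and Elastic does the rest!
Retrievers, introduced in 8.14, let you set up multi-stage retrieval pipelines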
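As a rough illustration of how these two features might be combined with the endpoint created earlier, the Python sketch below builds an index mapping with a semantic_text field and a single-stage search body that wraps a semantic query in a standard retriever. The field name "content" and both helper names are assumptions made for this example, and the exact features available depend on your Elasticsearch version.

```python
import json


def semantic_text_mapping(field, inference_id):
    """Index mapping with one semantic_text field bound to an inference endpoint."""
    return {
        "mappings": {
            "properties": {
                field: {"type": "semantic_text", "inference_id": inference_id}
            }
        }
    }


def semantic_retriever_search(field, query_text):
    """Search body: a standard retriever wrapping a semantic query."""
    return {
        "retriever": {
            "standard": {
                "query": {"semantic": {"field": field, "query": query_text}}
            }
        }
    }


# "content" is a hypothetical field name; "ali_ai_embeddings" is the
# endpoint created earlier in this article.
mapping = semantic_text_mapping("content", "ali_ai_embeddings")
search = semantic_retriever_search("content", "What is Elastic?")
print(json.dumps(mapping, indent=2))
print(json.dumps(search, indent=2))
```

The first body would be sent when creating an index and the second to that index's _search endpoint.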

But first, let's dive into our examples!

I. Completion

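First, Alibaba Cloud offers several chat completion models; their service IDs are listed in its API documentation.

Step 1: Configure the completion service

First, set up the text completion inference service: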

PUT _inference/completion/ali-chat
{
 "service": "alibabacloud-ai-search",
 "service_settings": {
     "host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
     "api_key": "xxxxxxxxxxxxxxxxxx",
     "service_id": "ops-qwen-turbo",
     "workspace" : "default"
 }
}

Response:

{
 "inference_id": "ali-chat",
 "task_type": "completion",
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "service_id": "ops-qwen-turbo",
   "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace": "default",
   "rate_limit": {
     "requests_per_minute": 1000
   }
 },
 "task_settings": {}
}

Step 2: Issue a completion request

Using the configured endpoint, send a POST request to generate a completion:

POST _inference/completion/ali-chat
{
 "input":["Where is the capital of Henan?"]
}

{
 "completion": [
   {
     "result": "The capital of Henan is Zhengzhou."
   }
 ]
}

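Unique to the Elastic inference API integration with Alibaba, chat history can be included in the input. In this example, we include the previous question and reply and then add: "What fun things are there?"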

POST _inference/completion/ali-chat
{
 "input":["Where is the capital of Henan?", "The capital of Henan is Zhengzhou.", "What fun things are there?" ]
}

The reply explicitly refers back to the chat history:

{
 "completion": [
   {
     "result": "I'm sorry, I do not have enough information to provide a specific list of fun things to do in Zhengzhou, Henan. I can only tell you that Zhengzhou is the capital of Henan province. To find out about fun activities, attractions, or events in Zhengzhou, I would suggest researching local tourism websites, asking locals, or checking out travel guides for the area."
   }
 ]
}

In a future update, we plan to let users include chat history explicitly, to improve ease of use.
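Until then, the history has to be flattened into the input array by hand. A minimal Python sketch of that step follows; the helper name `build_chat_input` is hypothetical, and the alternating question/reply ordering is inferred from the example request rather than from a documented contract.

```python
def build_chat_input(history, question):
    """Flatten (user_message, assistant_reply) turns plus a new question.

    Assumption: the service expects alternating user/assistant strings with
    the new question last, as in the example request in this article.
    """
    messages = []
    for user_message, assistant_reply in history:
        messages.extend([user_message, assistant_reply])
    messages.append(question)
    return {"input": messages}


history = [
    ("Where is the capital of Henan?", "The capital of Henan is Zhengzhou."),
]
request_body = build_chat_input(history, "What fun things are there?")
print(request_body)
```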

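II. Rerank

Next up is the rerank task type. Using Alibaba's powerful models, reranking reorders search results to improve relevance. If you want to learn more about the concept, check out this blog on Elastic Search Labs.

Step 1: Configure the rerank service

Configure the rerank inference service: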

PUT _inference/rerank/ali-rank
{
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "api_key": "xxxxxxxxxxxxxxxxxx",
   "service_id": "ops-bge-reranker-larger",
   "host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace" : "default"   
 }
}

{
 "inference_id": "ali-rank",
 "task_type": "rerank",
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "service_id": "ops-bge-reranker-larger",
   "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace": "default",
   "rate_limit": {
     "requests_per_minute": 1000
   }
 },
 "task_settings": {}
}

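Step 2: Issue a rerank request

Send a POST request to rerank the results of your search query. The rerank interface does not need much configuration (task_settings); it returns the relevance scores in order of decreasing relevance, along with each document's index in the input array: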

POST _inference/rerank/ali-rank
{
 "query": "What is the capital of the USA?",
 "input": [
   "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.",
   "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.",
   "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
   "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
   "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
   "North Dakota is a state in the United States. 672,591 people lived in North Dakota in the year 2010. The capital and seat of government is Bismarck."
 ]
}

{
 "rerank": [
   {
     "index": 3,
     "relevance_score": 0.9998832
   },
   {
     "index": 4,
     "relevance_score": 0.008847355
   },
   {
     "index": 5,
     "relevance_score": 0.0026626128
   },
   {
     "index": 0,
     "relevance_score": 0.00068250194
   },
   {
     "index": 2,
     "relevance_score": 0.00019716943
   },
   {
     "index": 1,
     "relevance_score": 0.00011591934
   }
 ]
}
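Because the response carries only indices and scores, the caller has to map each index back to the original input array. The Python sketch below shows one way to do that; `apply_rerank` is a hypothetical helper, and the document strings are shortened versions of the inputs above.

```python
def apply_rerank(documents, response, top_n=None):
    """Pair each reranked hit with its source document, keeping response order."""
    ranked = [
        (documents[hit["index"]], hit["relevance_score"])
        for hit in response["rerank"]
    ]
    return ranked if top_n is None else ranked[:top_n]


# Shortened versions of the six input documents, in their original order.
documents = [
    "Carson City is the capital city of the American state of Nevada.",
    "Capital punishment (the death penalty) has existed in the United States.",
    "The Commonwealth of the Northern Mariana Islands ... Its capital is Saipan.",
    "Washington, D.C. ... is the capital of the United States.",
    "Charlotte Amalie is the capital of the United States Virgin Islands.",
    "North Dakota is a state in the United States.",
]

# The rerank response shown above.
response = {
    "rerank": [
        {"index": 3, "relevance_score": 0.9998832},
        {"index": 4, "relevance_score": 0.008847355},
        {"index": 5, "relevance_score": 0.0026626128},
        {"index": 0, "relevance_score": 0.00068250194},
        {"index": 2, "relevance_score": 0.00019716943},
        {"index": 1, "relevance_score": 0.00011591934},
    ]
}

top_two = apply_rerank(documents, response, top_n=2)
for text, score in top_two:
    print(f"{score:.4f}  {text}")
```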

III. Sparse embedding

Alibaba provides a model dedicated to sparse embeddings; we will use ops-text-sparse-embedding-001 as our example.

Step 1: Configure the sparse embedding service

PUT _inference/sparse_embedding/ali-sparse-embedding
{
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "api_key": "xxxxxxxxxxxxxxxxxx",
   "service_id": "ops-text-sparse-embedding-001",
   "host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace" : "default"
 }
}

{
 "inference_id": "ali-sparse-embedding",
 "task_type": "sparse_embedding",
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "service_id": "ops-text-sparse-embedding-001",
   "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace": "default",
   "rate_limit": {
     "requests_per_minute": 1000
   }
 },
 "task_settings": {}
}

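Step 2: Issue a sparse embedding request

Sparse embedding has the following task_settings:

input_type - ingest or search
return_token - if true, the response includes the token text; otherwise it includes numeric token IDs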

POST _inference/sparse_embedding/ali-sparse-embedding
{
 "input": "Hello world",
 "task_settings": {
   "input_type": "search",
   "return_token": true
 }
}

{
 "sparse_embedding": [
   {
     "is_truncated": false,
     "embedding": {
       "hello": 0.27783203,
       "world": 0.28222656
     }
   }
 ]
}

With return_token set to false:

{
 "sparse_embedding": [
   {
     "is_truncated": false,
     "embedding": {
       "8999": 0.28222656,
       "35378": 0.27783203
     }
   }
 ]
}
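A sparse embedding is a map from token to weight, so two of them can be compared by summing the products of the weights for tokens they share. The Python sketch below illustrates this on the "Hello world" response with return_token set to true; the helper names are hypothetical and the document-side weights are made up for the example.

```python
def extract_sparse(response):
    """Return the token-to-weight map of the first sparse embedding."""
    return response["sparse_embedding"][0]["embedding"]


def sparse_dot(query_weights, doc_weights):
    """Sum the weight products over tokens present in both embeddings."""
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )


# The "Hello world" response shown above (return_token = true).
query_response = {
    "sparse_embedding": [
        {
            "is_truncated": False,
            "embedding": {"hello": 0.27783203, "world": 0.28222656},
        }
    ]
}
# Made-up document weights, for illustration only.
doc_weights = {"hello": 0.5, "elastic": 0.75}

score = sparse_dot(extract_sparse(query_response), doc_weights)
print(score)
```

Only "hello" appears in both maps, so it is the only token that contributes to the score.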

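IV. Text embedding

Alibaba also provides text embedding models for different tasks.

Step 1: Configure the text embedding service

Text embedding has one task_setting:

input_type - ingest or search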

PUT _inference/text_embedding/ali-embeddings
{
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "api_key": "xxxxxxxxxxxxxxxxxx",
   "service_id": "ops-text-embedding-001",
   "host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace" : "default"
 }
}

{
 "inference_id": "ali-embeddings",
 "task_type": "text_embedding",
 "service": "alibabacloud-ai-search",
 "service_settings": {
   "service_id": "ops-text-embedding-001",
   "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
   "workspace": "default",
   "rate_limit": {
     "requests_per_minute": 1000
   },
   "similarity": "dot_product",
   "dimensions": 1536
 },
 "task_settings": {}
}
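The input_type setting described in this step distinguishes text being indexed from text being searched. The Python sketch below builds request bodies for both cases. It assumes that task_settings can be passed per request for text embeddings in the same way as in the sparse embedding example, and that the literal value for ingestion is "ingest"; the helper name is hypothetical.

```python
# Assumed literal values for input_type; "search" appears in the sparse
# embedding example, "ingest" is an assumption.
VALID_INPUT_TYPES = {"ingest", "search"}


def embedding_request(text, input_type=None):
    """Build an inference request body, optionally tagging the input type."""
    body = {"input": text}
    if input_type is not None:
        if input_type not in VALID_INPUT_TYPES:
            raise ValueError(f"input_type must be one of {sorted(VALID_INPUT_TYPES)}")
        body["task_settings"] = {"input_type": input_type}
    return body


ingest_body = embedding_request("Hello world", input_type="ingest")
search_body = embedding_request("Hello world", input_type="search")
print(ingest_body)
print(search_body)
```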

Step 2: Issue a text embedding request

Send a POST request to generate a text embedding:

POST _inference/text_embedding/ali-embeddings
{
 "input": "Hello world"
}

{
 "text_embedding": [
   {
     "embedding": [
       -0.017036675,
       0.07038724,
       0.044685286,
       0.0064531807,
       0.013290042,
       0.011183944,
       -0.0020014185,
       -0.009508779,
…