
Reborn: We Meet at the Top of ES, Chapter 19 - Composite Sorting (Advanced): Build Your Own Personalized Ranking

article 2025/3/6 11:25:00

Table of Contents

- 0. Preface
- 1. What Is the Function score query
- 2. Basic Usage of field_value_factor
  - 2.1 Query Example
  - 2.2 Score Explanation
  - 2.3 Parameters
- 3. Basic Usage of random_score
  - 3.1 Query Example
  - 3.2 Parameters
- 4. weight and script_score
- 5. Decay Functions
  - 5.1 gauss
    - 5.1.1 Main Parameters
    - 5.1.2 Query Example
  - 5.2 exp
  - 5.3 linear

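0. Preface

Consider the following business scenario:
How do you implement e-commerce style search that ranks results in real time by a combination of factors such as price, sales volume, and rating?

This is the job of the star of this chapter: the Function score query.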

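1. What Is the Function score query

The Function score query modifies the scores of the documents matched by a query, and it comes up very often in day-to-day development.
The query accepts the following parameters: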

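query
- Purpose: defines which documents are matched and have their scores modified. Any valid query can be used
boost
- Description: a floating-point value that adjusts document scores
- Purpose: scales the score of every returned document
functions
- Description: an array of functions that define how the score is adjusted
- Purpose: each function in functions computes a score for the documents it applies to. The available functions are:
  - weight: multiplies the score by a fixed weight
  - random_score: adjusts the score with a random value
  - field_value_factor: computes the score from the value of a document field
  - script_score: computes the score with a script
  - decay functions: lower the score as a field's value moves away from an origin, following a gauss, exp, or linear curve
score_mode
- Purpose: how the scores of multiple functions are combined. Possible values (default: multiply):
  - sum: the function scores are added
  - multiply: the function scores are multiplied
  - avg: the function scores are averaged, weighted by each function's weight
  - first: only the score of the first function whose filter matches is used
  - max: the highest function score is used
  - min: the lowest function score is used
boost_mode
- Purpose: how the combined function score is merged with the document's original query score. Possible values (default: multiply):
  - replace: the function score replaces the original score
  - sum: function score + original score
  - multiply: function score * original score
  - max: max(function score, original score)
  - avg: average of the function score and the original score
  - min: min(function score, original score)
min_score
- Purpose: a floating-point threshold; documents whose final score is below it are excluded from the results
max_boost
- Description: a floating-point cap on the combined function score, applied before it is merged with the original score. It limits the function score, not the final document score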
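
To make the interaction between score_mode, boost_mode, and max_boost concrete, here is a minimal Python sketch of the combination rules listed above. It is an illustration, not Elasticsearch's implementation: the function names combine_functions and combine_with_query are made up for this example, and it assumes every function applies to the document (no per-function filter), so "first" simply picks the first entry.

```python
from math import prod


def combine_functions(scores, weights, score_mode="multiply"):
    """Combine per-function scores according to score_mode.

    scores  -- raw score of each function, before its weight is applied
    weights -- weight of each function (1 when none is given)
    """
    weighted = [s * w for s, w in zip(scores, weights)]
    if score_mode == "sum":
        return sum(weighted)
    if score_mode == "multiply":
        return prod(weighted)
    if score_mode == "avg":
        # Weighted average: divide by the sum of the weights.
        return sum(weighted) / sum(weights)
    if score_mode == "first":
        # Assumption: every function matches, so the first one wins.
        return weighted[0]
    if score_mode == "max":
        return max(weighted)
    if score_mode == "min":
        return min(weighted)
    raise ValueError(f"unknown score_mode: {score_mode}")


def combine_with_query(query_score, func_score, boost_mode="multiply",
                       max_boost=float("inf")):
    """Merge the combined function score with the original query score."""
    func_score = min(func_score, max_boost)  # max_boost caps the function score
    if boost_mode == "replace":
        return func_score
    if boost_mode == "sum":
        return query_score + func_score
    if boost_mode == "multiply":
        return query_score * func_score
    if boost_mode == "max":
        return max(query_score, func_score)
    if boost_mode == "avg":
        return (query_score + func_score) / 2
    if boost_mode == "min":
        return min(query_score, func_score)
    raise ValueError(f"unknown boost_mode: {boost_mode}")


if __name__ == "__main__":
    func_score = combine_functions([4.0, 2.0], [2, 3], score_mode="avg")
    print(combine_with_query(1.0, func_score, boost_mode="multiply", max_boost=3.5))
```

Keeping the two steps in separate functions mirrors the order in which the parameters take effect: functions are combined first, the result is capped, and only then is it merged with the query score.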

The sections below walk through concrete examples of how to use each function.

2. Basic Usage of field_value_factor

2.1 Query Example

Index the test data

POST /_bulk
{"index": {"_index":"test_19","_id": "1"}}
{"shopName":"elasticsearch","sales_volume": 2134,"score": 3.3}
{"index": {"_index":"test_19","_id": "2"}}
{"shopName":"java","sales_volume": 124912,"score": 4.9}
{"index": {"_index":"test_19","_id": "3"}}
{"shopName":"golang","sales_volume": 34352,"score": 4.2}
{"index": {"_index":"test_19", "_id": "4"}}
{"shopName":"nodejs","sales_volume": 7545,"score": 5.0}
{"index": {"_index":"test_19", "_id": "5"}}
{"shopName":"rust","sales_volume": 31221,"score": 5.0}

Query

GET test_19/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": """
              if (doc['sales_volume'].value < 10) {
                return _score * 0.1;
              } else if (doc['sales_volume'].value < 100) {
                return _score * 2;
              } else if (doc['sales_volume'].value < 1000) {
                return _score * 3;
              } else if (doc['sales_volume'].value < 10000) {
                return _score * 4;
              } else {
                return _score * 5;
              }
            """
          },
          "weight": 2
        },
        {
          "field_value_factor": {
            "field": "score",
            "factor": 1.5,
            "modifier": "sqrt",
            "missing": 1
          },
          "weight": 3
        }
      ],
      "min_score": 2,
      "max_boost": 3.5,
      "score_mode": "avg",
      "boost_mode": "multiply"
    }
  }
}

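2.2 Score Explanation

With score_mode=avg, the combined function score is the weighted average of the individual function scores:
score1 = (score(script_score) * weight(2) + score(field_value_factor) * weight(3)) / (weight(2) + weight(3))
Note that the divisor is the sum of the two functions' weights, not the number of functions. A function without an explicit weight has weight = 1.
max_boost=3.5 then caps score1 at 3.5.
With boost_mode=multiply, the final document score is:
final score = min(score1, 3.5) * original document score
Because the query is match_all, the original score of every document is 1.0. Documents whose final score is below min_score=2 are dropped from the results.

The field_value_factor function in this query is computed as:
sqrt(1.5 * doc['score'].value)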
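
As a sanity check, the calculation above can be replayed outside Elasticsearch. The following Python sketch applies the formulas to the five test documents from section 2.1. It is a hand-written approximation: the helper names are invented for this example, it assumes a match_all query score of 1.0, and it uses 64-bit floats, so the last digits may differ slightly from what Elasticsearch reports.

```python
import math

# Test documents from section 2.1: (shopName, sales_volume, score)
DOCS = [
    ("elasticsearch", 2134, 3.3),
    ("java", 124912, 4.9),
    ("golang", 34352, 4.2),
    ("nodejs", 7545, 5.0),
    ("rust", 31221, 5.0),
]

QUERY_SCORE = 1.0  # assumption: match_all scores every document 1.0


def script_score(sales_volume, query_score):
    """Python mirror of the Painless script used in the query."""
    if sales_volume < 10:
        return query_score * 0.1
    if sales_volume < 100:
        return query_score * 2
    if sales_volume < 1000:
        return query_score * 3
    if sales_volume < 10000:
        return query_score * 4
    return query_score * 5


def field_value_factor(value, factor=1.5):
    """modifier=sqrt applied to factor * field value."""
    return math.sqrt(factor * value)


def final_score(sales_volume, rating, max_boost=3.5):
    # score_mode=avg: weighted average with weights 2 and 3
    score1 = (script_score(sales_volume, QUERY_SCORE) * 2
              + field_value_factor(rating) * 3) / (2 + 3)
    # max_boost caps the function score; boost_mode=multiply merges it
    return min(score1, max_boost) * QUERY_SCORE


if __name__ == "__main__":
    for name, sales, rating in DOCS:
        print(f"{name}: {final_score(sales, rating):.4f}")
```

In this data set the documents with sales_volume of 10000 or more all reach a score1 above 3.5, so max_boost is what decides their final score.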

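2.3 Parameters

field: the document field whose value is used
factor: the factor the field value is multiplied by; defaults to 1
modifier: the function applied to the multiplied value. One of none, log, log1p, log2p, ln, ln1p, ln2p, square, sqrt, reciprocal; defaults to none
missing: the value used when a document does not have the field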
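
The modifiers are easier to compare side by side. Here is a small Python sketch of what each one computes, under the assumption (based on the documented behavior) that the log variants are base 10, the ln variants are natural logarithms, and factor is applied before the modifier. The function field_value_factor below is a stand-in written for this example, not an Elasticsearch API.

```python
import math

# Each modifier receives x = factor * field value.
MODIFIERS = {
    "none": lambda x: x,
    "log": lambda x: math.log10(x),
    "log1p": lambda x: math.log10(1 + x),
    "log2p": lambda x: math.log10(2 + x),
    "ln": lambda x: math.log(x),
    "ln1p": lambda x: math.log1p(x),
    "ln2p": lambda x: math.log(2 + x),
    "square": lambda x: x * x,
    "sqrt": lambda x: math.sqrt(x),
    "reciprocal": lambda x: 1.0 / x,
}


def field_value_factor(value, factor=1.0, modifier="none", missing=None):
    """Return modifier(factor * value), falling back to `missing`.

    `value` is None when the document has no such field. Assumption:
    the factor is also applied to the `missing` fallback value.
    """
    if value is None:
        if missing is None:
            raise ValueError("field is missing and no 'missing' value was given")
        value = missing
    return MODIFIERS[modifier](factor * value)


if __name__ == "__main__":
    for name in MODIFIERS:
        print(name, field_value_factor(5.0, factor=1.5, modifier=name))
```

Note that log and ln are undefined for 0 and negative for values between 0 and 1, which is why the 1p and 2p variants are usually the safer choice for fields that can be small.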

3. Basic Usage of random_score

3.1 Query Example

GET test_19/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "random_score": {
            "seed": 2,
            "field": "shopName.keyword"
          }
        }
      ],
      "boost_mode": "replace"
    }
  } 
}

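3.2 Parameters

seed: the seed for the random number generator
field: the field the random value is derived from; text fields cannot be used. With the same seed and field, the same query produces the same random values on every run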
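
The reproducibility described above can be pictured as hashing the seed together with the field value. The Python sketch below shows the idea only. It is not the hash Elasticsearch uses, and pseudo_random_score is a hypothetical helper; what it demonstrates is that a fixed seed plus a fixed field value always yields the same score in [0, 1), while a different seed reshuffles the order.

```python
import hashlib


def pseudo_random_score(seed, field_value):
    """Deterministic value in [0, 1) derived from seed and field value.

    Illustration only: this is NOT the hash Elasticsearch uses.
    """
    digest = hashlib.sha256(f"{seed}:{field_value}".encode("utf-8")).digest()
    # Keep the top 53 bits so the result is an exact float below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2 ** 53


if __name__ == "__main__":
    shops = ["elasticsearch", "java", "golang", "nodejs", "rust"]
    for seed in (2, 3):
        ranked = sorted(shops, key=lambda s: pseudo_random_score(seed, s),
                        reverse=True)
        print(seed, ranked)
```

This is the property that makes a seeded random_score usable for stable pagination: a user who keeps the same seed sees the same shuffled order on every page request.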

4. weight and script_score

Both weight and script_score were already covered in the field_value_factor example, so they are not repeated here.

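5. Decay Functions

Decay functions lower a document's score depending on how far the value of a field is from a given origin.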

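5.1 gauss

The gauss decay function uses a normal (Gaussian) curve to decay the score. The score starts at its maximum at the origin and falls off gradually as the distance (or elapsed time) from it grows.

5.1.1 Main Parameters

origin
- Purpose: the point from which the decay is measured; the score decays on both sides of it. The value is usually a number, a date, or a geo point
scale
- Purpose: the decay range, which controls how fast the score drops. At a distance of offset + scale from origin, the function returns decay (an optional parameter, 0.5 by default); documents closer than that score higher
offset (optional)
- Purpose: documents within offset of origin are not decayed at all. Defaults to 0

5.1.2 Query Example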

PUT test_19_decay
{
  "mappings": {
    "properties": {
      "location": {
         "type": "geo_point"
      },
      "name": {
        "type": "keyword"
      },
      "score": {
        "type": "double"
      }
    }
  }
}

POST /_bulk
{"index": {"_index":"test_19_decay","_id": "1"}}
{"name":"elasticsearch","location": {"lon": -71.34,"lat": 41.12}, "score": 3.2}
{"index": {"_index":"test_19_decay","_id": "2"}}
{"name":"java","location": {"lon": 123.21,"lat": 21.21},"score": 4.2}
{"index": {"_index":"test_19_decay","_id": "3"}}
{"name":"golang","location": {"lon": 73.21,"lat": -22.11},"score": 2.2}
{"index": {"_index":"test_19_decay","_id": "4"}}
{"name":"nodejs","location": {"lon": -98.21,"lat": -44.11},"score": 5.0}
{"index": {"_index":"test_19_decay","_id": "5"}}
{"name":"rust","location": {"lon": 65.12,"lat": 56.23},"score": 1.2}

GET test_19_decay/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "gauss": {
            "location": {
              "origin": "56.23,65.12",
              "scale": "10km",
              "offset": "1km"
            }
          },
          "weight": 2
        },
        {
          "script_score": {
            "script": {
              "source": "_score * doc['score'].value"
            }
          }
        }
      ],
      "score_mode": "sum", 
      "boost_mode": "sum"
    }
  }
}

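Notes on the gauss function
Within 1km (offset) of the point at lat 56.23, lon 65.12, the gauss function always returns 1. At 11km (offset + scale) it returns 0.5, the default decay, and beyond that it keeps falling toward 0. The weight of 2 then multiplies this value.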
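
The shape of the curve can be reproduced in a few lines of Python. The sketch below follows the gauss formula from the Elasticsearch documentation, with the distance already reduced to a plain number of kilometers. gauss_decay is a name chosen for this example, and decay=0.5 is the documented default that the query above leaves implicit.

```python
import math


def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1 inside the offset, `decay` at offset + scale."""
    d = max(0.0, abs(distance) - offset)
    # Choose the variance so that the curve equals `decay` at d == scale.
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    return math.exp(-d ** 2 / (2.0 * sigma_sq))


if __name__ == "__main__":
    # Same settings as the query: scale=10km, offset=1km
    for km in (0, 1, 6, 11, 21, 51):
        print(f"{km:>3} km -> {gauss_decay(km, scale=10, offset=1):.4f}")
```

The curve is flat near the origin and then drops steeply, which is why gauss suits cases where "close enough" documents should score almost as well as exact matches.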

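5.2 exp

exp decays the score exponentially: the score drops quickly as soon as the value moves away from origin, which suits rankings that are sensitive to distance or time. It is used the same way as gauss.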

GET test_19_decay/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "exp": {
            "location": {
              "origin": "56.23,65.12",
              "scale": "10km",
              "offset": "1km"
            }
          },
          "weight": 2
        },
        {
          "script_score": {
            "script": {
              "source": "_score * doc['score'].value"
            }
          }
        }
      ],
      "score_mode": "sum", 
      "boost_mode": "sum"
    }
  }
}
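
For comparison with gauss, here is the exponential curve as a Python sketch. It follows the exp formula from the Elasticsearch documentation, again assuming the distance is given in kilometers and decay keeps its default of 0.5. exp_decay is a name invented for this example.

```python
import math


def exp_decay(distance, scale, offset=0.0, decay=0.5):
    """Exponential decay: 1 inside the offset, `decay` at offset + scale."""
    d = max(0.0, abs(distance) - offset)
    lam = math.log(decay) / scale  # negative rate, since 0 < decay < 1
    return math.exp(lam * d)


if __name__ == "__main__":
    # Same settings as the query: scale=10km, offset=1km
    for km in (0, 1, 6, 11, 21, 51):
        print(f"{km:>3} km -> {exp_decay(km, scale=10, offset=1):.4f}")
```

Every additional scale kilometers multiplies the value by decay again, so the score halves at 11km, 21km, 31km, and so on, and never quite reaches 0.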

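5.3 linear

The linear decay function lowers the score in a straight line: as the distance grows, the score decreases at a constant rate, which suits cases that need a steady, predictable falloff. It is used the same way as gauss.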

GET test_19_decay/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "linear": {
            "location": {
              "origin": "56.23,65.12",
              "scale": "10km",
              "offset": "1km"
            }
          },
          "weight": 2
        },
        {
          "script_score": {
            "script": {
              "source": "_score * doc['score'].value"
            }
          }
        }
      ],
      "score_mode": "sum", 
      "boost_mode": "sum"
    }
  }
}
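
The linear curve completes the set. This Python sketch follows the linear formula from the Elasticsearch documentation, with the same assumptions as before: distance in kilometers, decay at its default of 0.5, and linear_decay as an illustrative name.

```python
def linear_decay(distance, scale, offset=0.0, decay=0.5):
    """Linear decay: 1 inside the offset, `decay` at offset + scale, then 0."""
    d = max(0.0, abs(distance) - offset)
    # Distance (past the offset) at which the score reaches 0.
    s = scale / (1.0 - decay)
    return max((s - d) / s, 0.0)


if __name__ == "__main__":
    # Same settings as the query: scale=10km, offset=1km
    for km in (0, 1, 6, 11, 21, 51):
        print(f"{km:>3} km -> {linear_decay(km, scale=10, offset=1):.4f}")
```

Unlike gauss and exp, linear reaches exactly 0: with scale=10km, offset=1km, and the default decay, every document farther than 21km gets no contribution from this function.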
