
[Elasticsearch] Sorting Rules for Search Results

article 2025/2/23 4:54:49

https://www.elastic.co/guide/en/elasticsearch/reference/8.17/sort-search-results.html

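The page linked above explains in detail how to sort search results in Elasticsearch, including field sorting, geo-distance sorting, script-based sorting, and sorting on nested fields. The main points are summarized below, with particular attention to sorting on numeric fields and the problems it can cause: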

---

1. Field Sorting

Elasticsearch can sort on one or more fields, each in ascending (`asc`) or descending (`desc`) order.

Example:

```json
GET /my-index-000001/_search
{
  "sort": [
    { "post_date": { "order": "asc" } },
    { "name": { "order": "desc" } }
  ],
  "query": { "term": { "user": "kimchy" } }
}
```

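2. Special Fields

• `_score`: sorts by relevance score.

• `_doc`: sorts by index order. This is the most efficient sort and suits cases where no particular order is required.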
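
As a minimal sketch, a request that only needs to walk through the matching documents, in no particular order, can sort on `_doc`. The index name and query are reused from the field sorting example and are purely illustrative:

```json
# Illustrative request: sort by index order, the cheapest option
GET /my-index-000001/_search
{
  "sort": [ "_doc" ],
  "query": { "term": { "user": "kimchy" } }
}
```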

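3. Optimizing Sort Performance

• Avoid sorting on `text` fields; sort on `keyword` or numeric fields instead.

• Index sorting (pre-sorting documents at index time) can speed up sorting at query time.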
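
Both points can be sketched in one index definition. The index name `my-sorted-index` and the `raw` sub-field name are assumptions made for illustration. The `index.sort.*` settings must be supplied when the index is created, and the `keyword` sub-field gives the `text` field a sortable counterpart:

```json
# Hypothetical index: pre-sorted by post_date, with a keyword sub-field for sorting on name
PUT /my-sorted-index
{
  "settings": {
    "index": {
      "sort.field": "post_date",
      "sort.order": "desc"
    }
  },
  "mappings": {
    "properties": {
      "post_date": { "type": "date" },
      "name": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword" }
        }
      }
    }
  }
}

# Sort on the keyword sub-field rather than on the text field itself
GET /my-sorted-index/_search
{
  "sort": [
    { "post_date": { "order": "desc" } },
    { "name.raw": { "order": "asc" } }
  ]
}
```

Whether index sorting pays off depends on how often queries sort in the same order as the index, since it adds work at index time.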

4. Multi-Value Fields

Fields with multiple values can be sorted on; the `mode` parameter selects which value is used as the sort key (`min`, `max`, `avg`, `sum`, or `median`).

Example:

```json
GET /my-index-000001/_search
{
  "sort": [
    { "price": { "order": "asc", "mode": "avg" } }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```
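
To make the effect of `mode` concrete, here is a small worked sketch. The document ID and the price values are illustrative assumptions:

```json
# Illustrative document: "price" holds two values
PUT /my-index-000001/_doc/1?refresh
{
  "product": "chocolate",
  "price": [20, 4]
}

# With "mode": "avg", this document is sorted by (20 + 4) / 2 = 12
GET /my-index-000001/_search
{
  "sort": [
    { "price": { "order": "asc", "mode": "avg" } }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```

With the same document, `min` would sort on 4, `max` on 20, and `sum` on 24.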

5. Nested Fields

Fields inside nested objects can be sorted on; the nested path must be given in `nested.path`.

Example:

```json
GET /my-index-000001/_search
{
  "sort": [
    {
      "offer.price": {
        "order": "asc",
        "nested": { "path": "offer" }
      }
    }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```
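
The `nested` object can also carry a `filter`, so that only matching nested objects contribute to the sort value. The sketch below assumes a hypothetical `offer.color` field and combines the filter with `mode`:

```json
# Assumes each nested offer has a "color" field (hypothetical)
GET /my-index-000001/_search
{
  "sort": [
    {
      "offer.price": {
        "order": "asc",
        "mode": "avg",
        "nested": {
          "path": "offer",
          "filter": {
            "term": { "offer.color": "blue" }
          }
        }
      }
    }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```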

6. Geo Distance Sorting

Documents can be sorted by their distance from a given point, computed from a `geo_point` field.

Example:

```json
GET /my-index-000001/_search
{
  "sort": [
    {
      "_geo_distance": {
        "pin.location": [-70, 40],
        "order": "asc",
        "unit": "km"
      }
    }
  ],
  "query": { "term": { "user": "kimchy" } }
}
```

7. Script Sorting

A Painless script can compute the sort value dynamically.

Example:

```json
GET /my-index-000001/_search
{
  "sort": [
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "source": "doc['field_name'].value * params.factor",
          "params": { "factor": 1.1 }
        },
        "order": "asc"
      }
    }
  ],
  "query": { "term": { "user": "kimchy" } }
}
```

8. Missing Fields

The `missing` parameter controls how documents without the sort field are treated: `_last`, `_first`, or a custom value that is used as their sort value.

Example:

```json
GET /my-index-000001/_search
{
  "sort": [
    { "price": { "order": "asc", "missing": "_last" } }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```
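
The custom-value form can be sketched as follows. The value `0` is an arbitrary assumption: documents without `price` are sorted as if their price were 0, which in ascending order places them ahead of any document with a positive price:

```json
# Illustrative: treat a missing price as 0 for sorting purposes
GET /my-index-000001/_search
{
  "sort": [
    { "price": { "order": "asc", "missing": 0 } }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```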

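9. Ignoring Unmapped Fields

By default a search fails if the sort field has no mapping. The `unmapped_type` parameter lets such fields be ignored instead of failing the request.

Example: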

```json
GET /my-index-000001/_search
{
  "sort": [
    { "price": { "order": "asc", "unmapped_type": "long" } }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```

10. Sorting Numeric Fields

Elasticsearch can sort on numeric fields, and the `numeric_type` parameter casts values of different numeric types to a single type so that a multi-index query can be sorted.

Example:

```json
GET /index_long,index_double/_search
{
  "sort": [
    {
      "field": {
        "order": "asc",
        "numeric_type": "double"
      }
    }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```

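Potential problems:

• Inconsistent field types: if the same field is mapped to different types in different indices (for example `long` in one index and `double` in another), sorting across them fails by default. `numeric_type` is needed to force a single type.

• Performance: the type conversion can add computational overhead, especially on large data sets.

• Overflow: `numeric_type` can also be set to `date_nanos` to convert date fields. Because nanoseconds are stored as longs, dates before 1970 or after 2262 are out of range and would overflow.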
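
The date case can be sketched in the same way as the numeric one. The index names `index_date` and `index_date_nanos` and the field name `timestamp` are hypothetical; the sketch assumes the field is mapped as `date` in one index and `date_nanos` in the other:

```json
# Hypothetical indices: "timestamp" is a date in one and a date_nanos in the other
GET /index_date,index_date_nanos/_search
{
  "sort": [
    {
      "timestamp": {
        "order": "asc",
        "numeric_type": "date_nanos"
      }
    }
  ]
}
```

Setting `numeric_type` to `date` instead converts values in the other direction and avoids the range limit, at the cost of losing nanosecond resolution in the sort values.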

11. Multi-Index Sorting

In a multi-index query where the sort field is mapped differently across indices, use `numeric_type` to unify the field type so that the results can be sorted.

Example:

```json
GET /index_long,index_double/_search
{
  "sort": [
    {
      "field": {
        "order": "asc",
        "numeric_type": "double"
      }
    }
  ],
  "query": { "term": { "product": "chocolate" } }
}
```

12. Track Scores

By default, relevance scores are not computed when sorting on a field. Set `track_scores` to `true` to compute and keep them.

Example:

```json
GET /my-index-000001/_search
{
  "track_scores": true,
  "sort": [
    { "post_date": { "order": "desc" } }
  ],
  "query": { "term": { "user": "kimchy" } }
}
```

13. Memory Considerations

Sorting loads the relevant field values into memory, so each shard needs enough memory to hold them. For string fields, sort on a non-analyzed field such as `keyword`.

---

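Summary

The page covers Elasticsearch's sorting features in detail, including field sorting, geo-distance sorting, script sorting, and nested field sorting. For numeric fields in particular, it describes the inconsistent field type problem that can arise in multi-index queries and shows how the `numeric_type` parameter resolves it. It also points out the performance and overflow problems that type conversion can introduce.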
