Advanced Elasticsearch
Contents
Two ways to search
Query DSL
match_all
match
match_phrase
multi_match
bool
filter
term & .keyword
aggregations
Two ways to search
URL + query parameters
GET /bank/_search?q=*&sort=account_number:asc
URL + request body
GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [ { "account_number": "asc" } ]
}
hits: the search-result section of the response
hits.hits: the array of matching documents
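Both styles express the same search. The following is a minimal Python sketch, using only the standard library, that builds the two forms without sending them anywhere. The helper names uri_search and body_search are hypothetical and exist only for this illustration.

```python
import json
from urllib.parse import urlencode


def uri_search(index, q, sort):
    """Build the path for a URI search: everything goes in the query string."""
    # safe="*:" keeps "*" and ":" readable instead of percent-encoding them.
    return f"/{index}/_search?" + urlencode({"q": q, "sort": sort}, safe="*:")


def body_search(index, query, sort):
    """Build the path and the JSON body for a request-body search."""
    body = {"query": query, "sort": sort}
    return f"/{index}/_search", json.dumps(body)


# Same search, expressed both ways.
print(uri_search("bank", "*", "account_number:asc"))
# -> /bank/_search?q=*&sort=account_number:asc

path, body = body_search("bank", {"match_all": {}}, [{"account_number": "asc"}])
print(path)
print(body)
```

The request-body form scales better once the query grows beyond a single condition, which is why the rest of these notes use it.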
Query DSL
Query DSL (Domain Specific Language) is the JSON-based language Elasticsearch uses to build complex queries.
Basic structure
{
  QUERY_NAME: {
    ARGUMENT: VALUE,
    ARGUMENT: VALUE, ...
  }
}
When the query targets a specific field
{
  QUERY_NAME: {
    FIELD_NAME: {
      ARGUMENT: VALUE,
      ARGUMENT: VALUE, ...
    }
  }
}
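The two templates above map directly onto nested dictionaries. As a rough sketch, the hypothetical helpers clause and field_clause below (names invented for this note) produce each shape from keyword arguments.

```python
def clause(query_name, **arguments):
    """Shape 1: {QUERY_NAME: {ARGUMENT: VALUE, ...}}"""
    return {query_name: dict(arguments)}


def field_clause(query_name, field_name, **arguments):
    """Shape 2: {QUERY_NAME: {FIELD_NAME: {ARGUMENT: VALUE, ...}}}"""
    return {query_name: {field_name: dict(arguments)}}


# A query with no arguments at all.
print(clause("match_all"))
# -> {'match_all': {}}

# Field-level queries: the field name sits between the query name and its arguments.
print(field_clause("match", "address", query="mill lane"))
print(field_clause("range", "balance", gte=40000, lte=50000))
```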
match_all
GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [ { "account_number": { "order": "asc" } } ],
  "from": 0,
  "size": 5,
  "_source": ["account_number", "balance"]
}
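In the request above, from is the offset of the first hit to return, size is the number of hits per page, and _source restricts which fields come back. A small sketch of how a page number could be turned into these values; page_request is a hypothetical helper, and 1-based page numbering is an assumption of this example.

```python
def page_request(page, size, fields):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be at least 1")
    return {
        "query": {"match_all": {}},
        "sort": [{"account_number": {"order": "asc"}}],
        "from": (page - 1) * size,  # number of hits to skip
        "size": size,               # number of hits to return
        "_source": fields,          # only these fields are returned per hit
    }


first = page_request(1, 5, ["account_number", "balance"])
third = page_request(3, 5, ["account_number", "balance"])
print(first["from"], third["from"])
# -> 0 10
```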
match
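Works on a single field. On a text field it is a full-text search: the query string is analyzed (tokenized) and the resulting terms are looked up in the inverted index. By default a document matches if it contains any of the terms, so the query below returns addresses containing "mill" or "lane".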
GET /bank/_search
{
  "query": {
    "match": {
      "address": "mill lane"
    }
  },
  "_source": ["account_number", "address"]
}
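To make the idea of tokenized matching against an inverted index concrete, here is a toy model in Python. It is a sketch only, not how Elasticsearch is implemented: the analyzer simply lowercases and splits on whitespace (an assumption), the sample addresses are illustrative, and no scoring is done.

```python
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Inverted index: token -> set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def match(index, query):
    """A document is a hit if it contains ANY token of the analyzed query."""
    hits = set()
    for token in analyze(query):
        hits |= index.get(token, set())
    return hits


# Illustrative sample data, not real query output.
docs = {1: "198 Mill Lane", 2: "990 Mill Road", 3: "671 Bristol Street"}
index = build_index(docs)

print(sorted(match(index, "mill lane")))
# -> [1, 2]
```

Document 2 is a hit even though it has no "lane", because one matching token is enough. In Elasticsearch itself, document 1 would rank higher because it matches both terms.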
match_phrase
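Phrase matching. The query string is still analyzed, but all of its terms must appear in the field adjacent to each other and in the same order, so the search condition behaves like one indivisible phrase.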
GET /bank/_search
{
  "query": {
    "match_phrase": {
      "address": "mill lane"
    }
  },
  "_source": ["account_number", "address"]
}
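The difference from match can be sketched with a toy phrase check in Python. This is an illustration under simplifying assumptions (whitespace tokenizer, lowercase, no slop), not the engine's actual algorithm.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def phrase_match(field_text, phrase):
    """True if the phrase tokens occur consecutively and in order in the field."""
    tokens = analyze(field_text)
    wanted = analyze(phrase)
    n = len(wanted)
    # Slide a window of length n over the field tokens and compare it to the phrase.
    return n > 0 and any(
        tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1)
    )


print(phrase_match("198 Mill Lane", "mill lane"))   # -> True
print(phrase_match("990 Mill Road", "mill lane"))   # -> False (no "lane")
print(phrase_match("Lane 198 Mill", "mill lane"))   # -> False (wrong order)
```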
multi_match
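Matches against multiple fields. The query string is analyzed, and a document is a hit if any term matches in any of the listed fields.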
GET /bank/_search
{
  "query": {
    "multi_match": {
      "query": "mill Movico",
      "fields": ["address", "city"]
    }
  },
  "_source": ["account_number", "address", "city"]
}
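As a rough mental model, multi_match runs the same analyzed match against each listed field and keeps documents that hit in at least one of them. The Python sketch below shows only that matching logic; scoring is left out, the analyzer is a simplified assumption, and the sample documents are illustrative.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def match(field_text, query):
    """True if the field shares at least one token with the query."""
    return bool(set(analyze(field_text)) & set(analyze(query)))


def multi_match(doc, query, fields):
    """True if the query matches in ANY of the given fields."""
    return any(match(doc.get(field, ""), query) for field in fields)


# Illustrative sample data, not real query output.
docs = [
    {"account_number": 1, "address": "198 Mill Lane", "city": "Urie"},
    {"account_number": 2, "address": "467 Hutchinson Court", "city": "Movico"},
    {"account_number": 3, "address": "789 Madison Street", "city": "Nogal"},
]

hits = [d["account_number"] for d in docs
        if multi_match(d, "mill Movico", ["address", "city"])]
print(hits)
# -> [1, 2]
```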
bool
Compound query that combines other query clauses.
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "gender": "M" } },
        { "match": { "address": "mill" } }
      ],
      "must_not": [
        { "match": { "age": "28" } }
      ],
      "should": [
        { "match": { "firstname": "winnie" } }
      ]
    }
  }
}
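must: every listed condition has to match
must_not: none of the listed conditions may match
should: the listed conditions may or may not match; those that do match raise the relevance score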
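The interplay of the three clause types can be sketched with a toy evaluator in Python. The score here is just the number of matching must and should clauses, which is an assumption made for illustration and not the relevance formula Elasticsearch uses; the sample documents are also made up.

```python
def matches(doc, field, value):
    """Toy check: the value equals one of the lowercased tokens of the field."""
    return str(value).lower() in str(doc.get(field, "")).lower().split()


def bool_query(docs, must=(), must_not=(), should=()):
    """Return (score, doc) pairs, best first."""
    results = []
    for doc in docs:
        if not all(matches(doc, f, v) for f, v in must):
            continue  # a must clause failed: not a hit
        if any(matches(doc, f, v) for f, v in must_not):
            continue  # a must_not clause matched: excluded
        # should clauses never exclude a document here, they only add to the score.
        score = len(must) + sum(matches(doc, f, v) for f, v in should)
        results.append((score, doc))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


# Illustrative sample data.
docs = [
    {"firstname": "Forbes", "gender": "M", "age": 28, "address": "990 Mill Road"},
    {"firstname": "Parker", "gender": "M", "age": 38, "address": "198 Mill Lane"},
    {"firstname": "Winnie", "gender": "M", "age": 38, "address": "715 Mill Avenue"},
    {"firstname": "Rosa", "gender": "F", "age": 32, "address": "10 Mill Street"},
]

results = bool_query(
    docs,
    must=[("gender", "M"), ("address", "mill")],
    must_not=[("age", 28)],
    should=[("firstname", "winnie")],
)
print([(score, doc["firstname"]) for score, doc in results])
# -> [(3, 'Winnie'), (2, 'Parker')]
```

Parker is still a hit without matching the should clause; Winnie simply ranks higher.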
filter
A query condition that does not contribute to the score; in effect a must that adds nothing to the relevance score.
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "gender": "M" } },
        { "match": { "address": "mill" } }
      ],
      "must_not": [
        { "match": { "age": "28" } }
      ],
      "should": [
        { "match": { "firstname": "winnie" } }
      ],
      "filter": {
        "range": {
          "balance": {
            "gte": 40000,
            "lte": 50000
          }
        }
      }
    }
  }
}
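Compared with the same query without the filter, fewer documents are hit, but the relevance scores of the remaining hits are unchanged.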
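That observation can be reproduced with a toy model in Python: the filter is a yes/no gate applied alongside scoring, so it removes hits without touching the score of those that remain. The scoring rule (one point per matching token) and the sample data are assumptions made for this sketch.

```python
def score(doc, scoring_clauses):
    """Toy score: one point per (field, token) clause found in the document."""
    return sum(
        1 for field, token in scoring_clauses
        if token in doc[field].lower().split()
    )


def search(docs, scoring_clauses, balance_range=None):
    """Return (account_number, score) for every hit, optionally filtered by balance."""
    hits = []
    for doc in docs:
        s = score(doc, scoring_clauses)
        if s == 0:
            continue  # nothing matched: not a hit
        if balance_range is not None:
            low, high = balance_range
            if not (low <= doc["balance"] <= high):
                continue  # filter: pass or fail only, never changes s
        hits.append((doc["account_number"], s))
    return hits


# Illustrative sample data.
docs = [
    {"account_number": 1, "address": "198 Mill Lane", "balance": 45000},
    {"account_number": 2, "address": "990 Mill Road", "balance": 12000},
    {"account_number": 3, "address": "12 Lane Court", "balance": 48000},
]
clauses = [("address", "mill"), ("address", "lane")]

before = search(docs, clauses)
after = search(docs, clauses, balance_range=(40000, 50000))
print(before)  # -> [(1, 2), (2, 1), (3, 1)]
print(after)   # -> [(1, 2), (3, 1)]
```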
term & .keyword
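Exact matching: the query value is compared directly with the terms stored in the index, without any tokenization or analysis of the query value.
Suited to non-text fields, and slightly faster than match.
"Because Elasticsearch analyzes text fields when it stores them, using term to exactly match a complete text value is very difficult."
Use term for non-text fields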
GET /bank/_search
{
  "query": {
    "term": {
      "account_number": {
        "value": "136"
      }
    }
  }
}
For an exact match on a text field, query its .keyword sub-field
GET /bank/_search
{
  "query": {
    "match": {
      "address.keyword": "198 Mill Lane"
    }
  }
}
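Why term struggles on a text field while .keyword works can be shown with a toy model in Python. It assumes the field is stored twice, once as analyzed tokens and once as the untouched string in a keyword sub-field, which is what the default dynamic mapping for strings produces. The analyzer is a simplified stand-in and the function names are hypothetical.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def index_text_field(value):
    """Store both representations of the same value, as in this toy model."""
    return {"text_tokens": set(analyze(value)), "keyword": value}


def term_on_text(stored, value):
    """term does not analyze its input: it is compared verbatim with the tokens."""
    return value in stored["text_tokens"]


def term_on_keyword(stored, value):
    """The keyword sub-field holds the whole original string as a single term."""
    return value == stored["keyword"]


stored = index_text_field("198 Mill Lane")

print(term_on_text(stored, "198 Mill Lane"))     # -> False (no such single token)
print(term_on_text(stored, "mill"))              # -> True  (one lowercased token)
print(term_on_keyword(stored, "198 Mill Lane"))  # -> True  (exact, case-sensitive)
print(term_on_keyword(stored, "198 mill lane"))  # -> False (case differs)
```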
aggregations
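Runs aggregations, which group the data and compute statistics over it, similar to GROUP BY and aggregate functions (such as SUM, AVG and COUNT) in SQL.
- Bucket Aggregations: place documents into buckets, where each bucket represents one group.
- Metric Aggregations: compute statistics such as sum, average, maximum and minimum.
- Pipeline Aggregations: perform a second round of computation on the results of other aggregations.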
GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "balanceAvg": {
      "avg": {
        "field": "balance"
      }
    },
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 10
      },
      "aggs": {
        "balanceAvg": {
          "avg": {
            "field": "balance"
          }
        },
        "genderAgg": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          "aggs": {
            "balanceAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}
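terms: bucket aggregation, groups documents by field value
avg: metric aggregation, computes the average
The request computes the average balance across all accounts,
then groups by age and computes the average balance for each age,
then nests a gender grouping inside each age bucket and computes the average balance for each age and gender combination.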
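The same three-level computation can be sketched in plain Python to show what the nested buckets contain. This is an in-memory illustration on made-up sample data, not a call to Elasticsearch; the result keys reuse the aggregation names from the request, and the layout of the result dictionary is an assumption of this sketch rather than the exact response format.

```python
from collections import defaultdict
from statistics import mean


def avg_balance(docs):
    """Metric aggregation: average of the balance field."""
    return mean(d["balance"] for d in docs)


def terms(docs, field):
    """Bucket aggregation: group documents by the value of one field."""
    buckets = defaultdict(list)
    for d in docs:
        buckets[d[field]].append(d)
    return buckets


def aggregate(docs):
    """Overall average, then per age, then per gender inside each age bucket."""
    result = {"balanceAvg": avg_balance(docs), "ageAgg": {}}
    for age, age_docs in terms(docs, "age").items():
        result["ageAgg"][age] = {
            "doc_count": len(age_docs),
            "balanceAvg": avg_balance(age_docs),
            "genderAgg": {
                gender: {"doc_count": len(g_docs), "balanceAvg": avg_balance(g_docs)}
                for gender, g_docs in terms(age_docs, "gender").items()
            },
        }
    return result


# Illustrative sample data.
docs = [
    {"age": 30, "gender": "M", "balance": 1000},
    {"age": 30, "gender": "F", "balance": 3000},
    {"age": 30, "gender": "M", "balance": 2000},
    {"age": 40, "gender": "F", "balance": 5000},
]

result = aggregate(docs)
print(result["balanceAvg"])                                # overall average
print(result["ageAgg"][30]["balanceAvg"])                  # average for age 30
print(result["ageAgg"][30]["genderAgg"]["M"]["balanceAvg"])  # age 30, gender M
```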