POST /my_movies/_doc/1
{"title":"Speed","actors":[
{"first_name":"Keanu","last_name":"Reeves"},
{"first_name":"Dennis","last_name":"Hopper"}]}
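# This search returns an unwanted hit: the array of objects is flattened into key-value pairs, so "Keanu" and "Hopper" match even though they belong to different actors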
POST /my_movies/_search
{"query":{"bool":{"must":[
{"match":{"actors.first_name":"Keanu"}},
{"match":{"actors.last_name":"Hopper"}}]}}}
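The false positive above can be reproduced outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch's actual indexing code: the `flatten` helper is hypothetical and ignores text analysis (tokenizing, lowercasing). It only shows how the link between one actor's first and last name is lost once every leaf field collects the values of all objects.

```python
# Simplified model of the default "object" mapping: each leaf field ends up
# holding the values from every object in the array.
def flatten(doc, prefix=""):
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Recurse into the inner object and merge its values.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    flat.setdefault(sub_path, []).extend(sub_values)
            else:
                flat.setdefault(path, []).append(item)
    return flat


movie = {
    "title": "Speed",
    "actors": [
        {"first_name": "Keanu", "last_name": "Reeves"},
        {"first_name": "Dennis", "last_name": "Hopper"},
    ],
}

flat_movie = flatten(movie)

# Both clauses of the bool/must query are satisfied, although no single
# actor is named "Keanu Hopper".
false_positive = (
    "Keanu" in flat_movie["actors.first_name"]
    and "Hopper" in flat_movie["actors.last_name"]
)
```

After flattening, `actors.first_name` holds both first names and `actors.last_name` holds both last names, with nothing recording which pair came from the same actor.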
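Nested Object
Nested Data Type
Nested data type: allows each object in an array of objects to be indexed independently
Use the nested and properties keywords to index every actor as a separate document
Internally, each nested object is stored as its own hidden Lucene document, separate from the root document, and the two are joined at query time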
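As an illustration of what independent indexing buys, here is a minimal Python sketch under simplifying assumptions: `nested_match` is a hypothetical helper, values are compared by exact equality, and nothing here reflects how Lucene actually executes the join. The point is only that all conditions must hold within one and the same object.

```python
def nested_match(doc, path, conditions):
    # Each object under `path` is checked on its own, mirroring the way each
    # nested object is indexed as a separate document.
    return any(
        all(obj.get(field) == wanted for field, wanted in conditions.items())
        for obj in doc.get(path, [])
    )


movie = {
    "title": "Speed",
    "actors": [
        {"first_name": "Keanu", "last_name": "Reeves"},
        {"first_name": "Dennis", "last_name": "Hopper"},
    ],
}

# No single actor has both values, so this combination does not match.
keanu_hopper = nested_match(
    movie, "actors", {"first_name": "Keanu", "last_name": "Hopper"}
)

# Both values belong to the same actor, so this one matches.
keanu_reeves = nested_match(
    movie, "actors", {"first_name": "Keanu", "last_name": "Reeves"}
)
```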
# Create the mapping with a nested object
PUT /my_movies
{"mappings":{"properties":{
"actors":{"type":"nested","properties":{
"first_name":{"type":"keyword"},
"last_name":{"type":"keyword"}}},
"title":{"type":"text",
"fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}
POST /my_movies/_doc/1
{"title":"Speed","actors":[
{"first_name":"Keanu","last_name":"Reeves"},
{"first_name":"Dennis","last_name":"Hopper"}]}
# Nested query: both conditions must match within the same actor, so Keanu + Hopper returns no hits
POST /my_movies/_search
{"query":{"bool":{"must":[
{"match":{"title": "Speed"}},
{"nested":{"path":"actors","query":{"bool":{"must":[
{"match":{"actors.first_name":"Keanu"}},
{"match":{"actors.last_name":"Hopper"}}]}}}}]}}}
# Nested Aggregation
POST /my_movies/_search
{"size":0,"aggs":{"actors":{
"nested":{"path":"actors"},
"aggs":{"actor_name":{"terms":{
"field":"actors.first_name","size":10}}}}}}
# A regular (non-nested) terms aggregation does not work on a nested field: it returns no buckets
POST /my_movies/_search
{"size":0,"aggs":{
"NAME":{"terms":{
"field":"actors.first_name","size":10}}}}
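Parent / Child Relationship
Limitation of objects and nested objects: every update may require reindexing the whole document (the root object and its nested objects)
ES provides an implementation similar to a join in a relational database
It is implemented with the join data type: maintaining a parent/child relationship keeps the two objects separate
Parent and child documents are two independent documents
Updating a parent document does not require reindexing its children
Adding, updating, or deleting a child document does not affect the parent or the other children
Notes
Parent and child documents must live on the same shard, which keeps the join efficient at query time
When indexing a child document, you must specify its parent ID. Use the routing parameter to make sure it is assigned to the same shard as the parent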
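Why the routing parameter puts parent and child on the same shard can be sketched in a few lines of Python. This is a simplified model, not Elasticsearch's implementation: `shard_for` is a hypothetical helper, CRC32 is a stand-in for Elasticsearch's own hash function, and the real formula has additional factors, so the shard numbers computed here will not match a real cluster.

```python
import zlib


def shard_for(routing: str, number_of_shards: int) -> int:
    # Stand-in hash (CRC32), used only for illustration.
    return zlib.crc32(routing.encode("utf-8")) % number_of_shards


NUMBER_OF_SHARDS = 2

# Without an explicit routing value, a document is routed by its own ID.
parent_shard = shard_for("blog1", NUMBER_OF_SHARDS)

# A child indexed with ?routing=blog1 hashes the same value as its parent,
# so it always lands on the parent's shard.
child_shard = shard_for("blog1", NUMBER_OF_SHARDS)

# Routed by its own ID instead, the child may or may not share that shard.
unrouted_child_shard = shard_for("comment1", NUMBER_OF_SHARDS)
```

Because the shard depends only on the routing value, reusing the parent ID as the routing value for every child is enough to co-locate the whole family.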
# Define the Parent/Child mapping
PUT /my_blogs
{"settings":{"number_of_shards":2},
"mappings":{"properties":{
"blog_comments_relation":{"relations":{"blog":"comment"},
"type":"join"},
"content":{"type":"text"},
"title":{"type":"keyword"}}}}
# Index a parent document
PUT /my_blogs/_doc/blog1
{"title":"Learning Elasticsearch",
"content":"learning ELK ",
"blog_comments_relation":{"name":"blog"}}
# Index a child document (routing is set to the parent ID)
PUT /my_blogs/_doc/comment1?routing=blog1
{"comment":"I am learning ELK",
"username":"Jack",
"blog_comments_relation":{"name":"comment","parent":"blog1"}}
# Parent Id query: returns the children of the given parent
POST /my_blogs/_search
{"query":{"parent_id":{"type":"comment","id":"blog2"}}}
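The join queries in this part (parent_id, has_child, has_parent) differ mainly in which side of the relation is filtered and which side is returned. The following in-memory Python sketch illustrates those semantics only. The three helper functions are hypothetical, matching is by exact equality, and the data mirrors the blog1 / comment1 documents indexed earlier.

```python
docs = {
    "blog1": {"relation": "blog", "title": "Learning Elasticsearch"},
    "comment1": {"relation": "comment", "parent": "blog1", "username": "Jack"},
}


def parent_id(docs, parent):
    # Children that point at the given parent ID.
    return sorted(i for i, d in docs.items() if d.get("parent") == parent)


def has_child(docs, field, value):
    # Parents that have at least one child matching the condition.
    parents = {
        d["parent"]
        for d in docs.values()
        if "parent" in d and d.get(field) == value
    }
    return sorted(p for p in parents if p in docs)


def has_parent(docs, field, value):
    # Children whose parent matches the condition.
    return sorted(
        i
        for i, d in docs.items()
        if d.get("parent") in docs and docs[d["parent"]].get(field) == value
    )


children_of_blog1 = parent_id(docs, "blog1")
blogs_commented_by_jack = has_child(docs, "username", "Jack")
comments_on_es_blog = has_parent(docs, "title", "Learning Elasticsearch")
```

In this sketch has_child filters on the child and returns the parent, while has_parent filters on the parent and returns the children.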
# Has Child query: returns the parent documents
POST /my_blogs/_search
{"query":{"has_child":{
"type":"comment","query":{"match":{"username":"Jack"}}}}}
# Has Parent query: returns the related child documents
POST /my_blogs/_search
{"query":{"has_parent":{
"parent_type":"blog","query":{"match":{"title":"Learning Hadoop"}}}}}
# Get a child document by ID only (without routing, the request may go to the wrong shard and not find it)
GET /my_blogs/_doc/comment3
# Get a child document by ID and routing
GET /my_blogs/_doc/comment3?routing=blog2
# Update a child document
PUT /my_blogs/_doc/comment3?routing=blog2
{"comment":"Hello Hadoop??",
"blog_comments_relation":{"name":"comment","parent":"blog2"}}