[ES in Practice] A Brief Guide to Managing Tasks in Elasticsearch
The commands below work on both Elasticsearch 5.x and 6.x.
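Querying tasks
Parameter | Required | Type | Notes |
---|---|---|---|
actions | No | string | Comma-separated list or wildcard expression of actions used to limit the request. The * wildcard is supported. |
detailed | No | boolean | If true, the response includes detailed task information, such as the description field shown in the example response. |
group_by | No | string | Key used to group the tasks in the response. Allowed values: nodes, parents, none. Defaults to nodes, which groups tasks by node. |
nodes | No | string | Limits the query to the given nodes. The value is a comma-separated list of node IDs or names. |
parent_task_id | No | string | Returns only tasks with the given parent task ID. By default, or when the value is -1, all tasks are returned. |
timeout | No | time value (number plus a supported time unit) | How long to wait for each node to respond. A node that does not respond before the timeout is left out of the response and is listed in the node_failures property instead. Defaults to 30s. |
wait_for_completion | No | boolean | If true, the request blocks until all matching tasks have completed. Defaults to false. |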
Example
GET _tasks?detailed&actions=*data/read/search
Response
{
  "nodes": {
    "kCTg3PK6QhOI8tTH1Qvu7g": {
      "name": "bdes222-prd1",
      "transport_address": "10.96.120.112:8201",
      "host": "bdes222-prd1.cnsuning.com",
      "ip": "10.96.120.112:8201",
      "roles": [
        "data",
        "ingest"
      ],
      "attributes": {
        "ml.enabled": "true"
      },
      "tasks": {
        "kCTg3PK6QhOI8tTH1Qvu7g:3369203332": {
          "node": "kCTg3PK6QhOI8tTH1Qvu7g",
          "id": 3369203332,
          "type": "transport",
          "action": "indices:data/read/search",
          "description": "indices[budsin_and_nbillingin_error_topic_index], types[budsin_and_nbillingin_error_topic_type], search_type[QUERY_THEN_FETCH], source[{\"size\":1000,\"query\":{\"bool\":{\"must\":[{\"match\":{\"orderItemStatus\":{\"query\":\"01\",\"operator\":\"OR\",\"prefix_length\":0,\"max_expansions\":50,\"fuzzy_transpositions\":true,\"lenient\":false,\"zero_terms_query\":\"NONE\",\"boost\":1.0}}},{\"match\":{\"systemName\":{\"query\":\"budsin\",\"operator\":\"OR\",\"prefix_length\":0,\"max_expansions\":50,\"fuzzy_transpositions\":true,\"lenient\":false,\"zero_terms_query\":\"NONE\",\"boost\":1.0}}},{\"match\":{\"orderItemId\":{\"query\":\"210100023293571005\",\"operator\":\"OR\",\"prefix_length\":0,\"max_expansions\":50,\"fuzzy_transpositions\":true,\"lenient\":false,\"zero_terms_query\":\"NONE\",\"boost\":1.0}}},{\"match\":{\"statusName\":{\"query\":\"FQ\",\"operator\":\"OR\",\"prefix_length\":0,\"max_expansions\":50,\"fuzzy_transpositions\":true,\"lenient\":false,\"zero_terms_query\":\"NONE\",\"boost\":1.0}}},{\"match\":{\"serialNo\":{\"query\":\"1\",\"operator\":\"OR\",\"prefix_length\":0,\"max_expansions\":50,\"fuzzy_transpositions\":true,\"lenient\":false,\"zero_terms_query\":\"NONE\",\"boost\":1.0}}}],\"disable_coord\":false,\"adjust_pure_negative\":true,\"boost\":1.0}},\"version\":true,\"_source\":false,\"sort\":[{\"_doc\":{\"order\":\"asc\"}}]}]",
          "start_time_in_millis": 1724323962569,
          "running_time_in_nanos": 26397377,
          "cancellable": true,
          "parent_task_id": "kCTg3PK6QhOI8tTH1Qvu7g:3369203331"
        }
      }
    }
  }
}
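In the response, parent_task_id is the ID of the task's parent task, cancellable indicates whether the task can be cancelled, and description holds details about the task, such as the query body.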
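To make these three fields easier to work with, here is a minimal Python sketch that flattens a node-grouped response into a list of cancellable tasks. The function name list_cancellable_tasks and the trimmed sample dictionary are illustrative assumptions, not part of Elasticsearch; the sketch only assumes the response has the shape shown in the example.

```python
from typing import Any, Dict, List


def list_cancellable_tasks(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a node-grouped task list response into its cancellable tasks."""
    found = []
    for node_id, node in response.get("nodes", {}).items():
        for task_id, task in node.get("tasks", {}).items():
            if not task.get("cancellable", False):
                continue
            found.append({
                "task_id": task_id,
                "node": node_id,
                "action": task.get("action"),
                # A task without a parent may lack this field, hence .get().
                "parent_task_id": task.get("parent_task_id"),
                "running_ms": task.get("running_time_in_nanos", 0) / 1_000_000,
            })
    return found


# Trimmed copy of the example response (description omitted for brevity).
sample = {
    "nodes": {
        "kCTg3PK6QhOI8tTH1Qvu7g": {
            "name": "bdes222-prd1",
            "tasks": {
                "kCTg3PK6QhOI8tTH1Qvu7g:3369203332": {
                    "action": "indices:data/read/search",
                    "running_time_in_nanos": 26397377,
                    "cancellable": True,
                    "parent_task_id": "kCTg3PK6QhOI8tTH1Qvu7g:3369203331",
                }
            },
        }
    }
}

tasks = list_cancellable_tasks(sample)
for t in tasks:
    print(t["task_id"], t["parent_task_id"], t["running_ms"])
```

The parent_task_id values collected this way are the IDs that the cancel request expects.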
Cancelling tasks
Cancelling a specific task
POST _tasks/<taskid>/_cancel
Here <taskid> is the parent_task_id value from the query response, in the form <nodeId>:<taskNumber>.
Example
POST _tasks/kCTg3PK6QhOI8tTH1Qvu7g:3369203331/_cancel
Cancelling tasks on specific nodes
POST _tasks/_cancel?nodes=<nodeId1,nodeId2>&actions=*data/read/search
<nodeId1,nodeId2> is a comma-separated list of node IDs or names.
Example
Cancel all search tasks on one node:
POST _tasks/_cancel?nodes=kCTg3PK6QhOI8tTH1Qvu7g&actions=*data/read/search
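When these requests are issued from a script rather than from a console, the two paths can be assembled programmatically. The following Python sketch only builds the request paths for the two cancel forms. The helper names cancel_task_path and cancel_on_nodes_path are hypothetical, and sending the POST request is left to whichever HTTP client is in use.

```python
from typing import Iterable
from urllib.parse import quote, urlencode


def cancel_task_path(task_id: str) -> str:
    """Path for cancelling one task, e.g. a parent_task_id value."""
    # Keep ':' literal so the <nodeId>:<taskNumber> form stays readable.
    return "/_tasks/{}/_cancel".format(quote(task_id, safe=":"))


def cancel_on_nodes_path(nodes: Iterable[str], actions: str) -> str:
    """Path for cancelling tasks matching `actions` on the given nodes."""
    query = urlencode(
        {"nodes": ",".join(nodes), "actions": actions},
        safe=",*/:",  # leave list separators and wildcards unescaped
    )
    return "/_tasks/_cancel?" + query


single = cancel_task_path("kCTg3PK6QhOI8tTH1Qvu7g:3369203331")
by_node = cancel_on_nodes_path(["kCTg3PK6QhOI8tTH1Qvu7g"], "*data/read/search")
print("POST", single)
print("POST", by_node)
```

Both printed paths match the console examples, so they can be appended to the cluster's base URL as-is.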
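Use case
Some large queries, or queries through an index alias that covers too many indices, need too much memory. Because the circuit breakers before version 7 are not very sensitive, memory usage on the node climbs, responses slow down, and the node eventually runs out of memory (OOM) and becomes unavailable. When a single node responds slowly, it drags down the serving capacity of the whole cluster and makes some cluster services unavailable.
Solution: restarting the problem node promptly works, but it carries a risk of data loss. Consider detecting queries that span many indices on the management platform or on the business side, alerting on them promptly, and then cancelling them through the task management API, rather than restarting the node as the first resort.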
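The detection step could be sketched roughly as follows in Python. The sketch assumes the description of a search task starts with indices[...] holding comma-separated names, as in the example response. The function names and both thresholds are hypothetical and would need tuning for a real cluster.

```python
import re
from typing import Any, Dict, List

# Matches the leading "indices[a,b,c]" part of a search task description.
_INDICES_RE = re.compile(r"^indices\[([^\]]*)\]")


def count_index_expressions(description: str) -> int:
    """Count the names listed in indices[...]; 0 if absent or empty.

    Aliases and wildcards are counted as one name each, and an empty list
    may mean "all indices", so this undercounts in those cases.
    """
    m = _INDICES_RE.match(description or "")
    if not m:
        return 0
    return len([name for name in m.group(1).split(",") if name.strip()])


def find_wide_searches(response: Dict[str, Any],
                       max_indices: int = 50,
                       min_running_ms: float = 0.0) -> List[str]:
    """Return task IDs to cancel for searches spanning too many indices."""
    targets: List[str] = []
    for node in response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            if not task.get("cancellable"):
                continue
            running_ms = task.get("running_time_in_nanos", 0) / 1_000_000
            wide = count_index_expressions(task.get("description", "")) > max_indices
            if wide and running_ms >= min_running_ms:
                # Prefer the parent task ID, as in the cancel example.
                target = task.get("parent_task_id", task_id)
                if target not in targets:
                    targets.append(target)
    return targets


sample = {"nodes": {"n1": {"tasks": {
    "n1:11": {"cancellable": True, "running_time_in_nanos": 5_000_000_000,
              "parent_task_id": "n1:10",
              "description": "indices[a,b,c], types[], search_type[QUERY_THEN_FETCH], source[{}]"},
    "n1:21": {"cancellable": True, "running_time_in_nanos": 1_000_000,
              "parent_task_id": "n1:20",
              "description": "indices[a], types[], search_type[QUERY_THEN_FETCH], source[{}]"},
}}}}

targets = find_wide_searches(sample, max_indices=2)
print(targets)
```

The returned IDs can be alerted on first and then passed to the cancel request once someone has confirmed that the query is safe to stop.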
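Bonus
Configuration strategy for the ES audit log
Main changes
- Allow an unlimited number of log files to be generated per day; the default is 7.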
appender.audit_rolling.strategy.fileIndex = nomax
- Change the delete policy to a combined condition: keep files for at most 30 days or up to 10 GB in total.
appender.audit_rolling.strategy.action.condition.nested_condition.type = IfAny
appender.audit_rolling.strategy.action.condition.nested_condition.lastModify.type = IfLastModified
appender.audit_rolling.strategy.action.condition.nested_condition.lastModify.age = 30D
appender.audit_rolling.strategy.action.condition.nested_condition.fileSize.type = IfAccumulatedFileSize
appender.audit_rolling.strategy.action.condition.nested_condition.fileSize.exceeds = 10GB
Example:
appender.audit_rolling.type = RollingFile
appender.audit_rolling.name = audit_rolling
appender.audit_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${role_name}_audit.log
appender.audit_rolling.layout.type = PatternLayout
appender.audit_rolling.layout.pattern = {\
"@timestamp":"%d{ISO8601}"\
%varsNotEmpty{, "node.name":"%enc{%map{node.name}}{JSON}"}\
%varsNotEmpty{, "node.id":"%enc{%map{node.id}}{JSON}"}\
%varsNotEmpty{, "host.name":"%enc{%map{host.name}}{JSON}"}\
%varsNotEmpty{, "host.ip":"%enc{%map{host.ip}}{JSON}"}\
%varsNotEmpty{, "event.type":"%enc{%map{event.type}}{JSON}"}\
%varsNotEmpty{, "event.action":"%enc{%map{event.action}}{JSON}"}\
%varsNotEmpty{, "user.name":"%enc{%map{user.name}}{JSON}"}\
%varsNotEmpty{, "user.run_by.name":"%enc{%map{user.run_by.name}}{JSON}"}\
%varsNotEmpty{, "user.run_as.name":"%enc{%map{user.run_as.name}}{JSON}"}\
%varsNotEmpty{, "user.realm":"%enc{%map{user.realm}}{JSON}"}\
%varsNotEmpty{, "user.run_by.realm":"%enc{%map{user.run_by.realm}}{JSON}"}\
%varsNotEmpty{, "user.run_as.realm":"%enc{%map{user.run_as.realm}}{JSON}"}\
%varsNotEmpty{, "user.roles":%map{user.roles}}\
%varsNotEmpty{, "origin.type":"%enc{%map{origin.type}}{JSON}"}\
%varsNotEmpty{, "origin.address":"%enc{%map{origin.address}}{JSON}"}\
%varsNotEmpty{, "realm":"%enc{%map{realm}}{JSON}"}\
%varsNotEmpty{, "url.path":"%enc{%map{url.path}}{JSON}"}\
%varsNotEmpty{, "url.query":"%enc{%map{url.query}}{JSON}"}\
%varsNotEmpty{, "request.method":"%enc{%map{request.method}}{JSON}"}\
%varsNotEmpty{, "request.body":"%enc{%map{request.body}}{JSON}"}\
%varsNotEmpty{, "request.id":"%enc{%map{request.id}}{JSON}"}\
%varsNotEmpty{, "action":"%enc{%map{action}}{JSON}"}\
%varsNotEmpty{, "request.name":"%enc{%map{request.name}}{JSON}"}\
%varsNotEmpty{, "indices":%map{indices}}\
%varsNotEmpty{, "opaque_id":"%enc{%map{opaque_id}}{JSON}"}\
%varsNotEmpty{, "x_forwarded_for":"%enc{%map{x_forwarded_for}}{JSON}"}\
%varsNotEmpty{, "transport.profile":"%enc{%map{transport.profile}}{JSON}"}\
%varsNotEmpty{, "rule":"%enc{%map{rule}}{JSON}"}\
%varsNotEmpty{, "event.category":"%enc{%map{event.category}}{JSON}"}\
}%n
appender.audit_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${role_name}_audit-%d{yyyy-MM-dd}-%i.log.gz
appender.audit_rolling.policies.type = Policies
appender.audit_rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.audit_rolling.policies.time.interval = 1
appender.audit_rolling.policies.time.modulate = true
appender.audit_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.audit_rolling.policies.size.size = 1024GB
appender.audit_rolling.strategy.type = DefaultRolloverStrategy
appender.audit_rolling.strategy.fileIndex = nomax
appender.audit_rolling.strategy.action.type = Delete
appender.audit_rolling.strategy.action.basepath = ${sys:es.logs.base_path}
appender.audit_rolling.strategy.action.condition.type = IfFileName
appender.audit_rolling.strategy.action.condition.glob = ${role_name}_audit-*
appender.audit_rolling.strategy.action.condition.nested_condition.type = IfAny
appender.audit_rolling.strategy.action.condition.nested_condition.lastModify.type = IfLastModified
appender.audit_rolling.strategy.action.condition.nested_condition.lastModify.age = 30D
appender.audit_rolling.strategy.action.condition.nested_condition.fileSize.type = IfAccumulatedFileSize
appender.audit_rolling.strategy.action.condition.nested_condition.fileSize.exceeds = 10GB
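To illustrate how the combined delete condition is expected to behave, here is a simplified Python model of it. It rests on an assumption about Log4j 2 semantics rather than on anything in the configuration itself: files are visited newest first, IfLastModified accepts files at least 30 days old, and IfAccumulatedFileSize accepts files once the running total exceeds 10 GB. The file names and sizes are made up.

```python
from typing import List, Tuple

GB = 1024 ** 3


def files_to_delete(files: List[Tuple[str, float, int]],
                    max_age_days: float = 30.0,
                    max_total_bytes: int = 10 * GB) -> List[str]:
    """Simplified model of IfAny(IfLastModified, IfAccumulatedFileSize).

    `files` holds (name, age_in_days, size_in_bytes) tuples.
    """
    doomed = []
    accumulated = 0
    # Assumed walk order: most recently modified files first.
    for name, age_days, size in sorted(files, key=lambda f: f[1]):
        accumulated += size
        too_old = age_days >= max_age_days
        over_budget = accumulated > max_total_bytes
        if too_old or over_budget:  # IfAny: either condition is enough
            doomed.append(name)
    return doomed


# Hypothetical rolled audit files: three 4 GB files and one old 1 GB file.
rolled = [
    ("audit-day1.log.gz", 1, 4 * GB),
    ("audit-day2.log.gz", 2, 4 * GB),
    ("audit-day3.log.gz", 3, 4 * GB),
    ("audit-day40.log.gz", 40, 1 * GB),
]
deleted = files_to_delete(rolled)
print(deleted)
```

In this model the two newest files stay within the 10 GB budget, the third pushes the total past it, and the fourth is removed because of its age.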