PostgreSQL日志中的SQL记录时机 —— log_statement 和 log_min_duration_statement
最近跟朋友讨论到PostgreSQL日志中的SQL记录时机,研究了下log_statement 和 log_min_duration_statement两个参数,记录一下。
一、 参数简介
1. log_statement
① 作用
控制记录SQL的类型,可选值为:
- none:关闭(默认)
- ddl:DDL语句
- mod:DDL和所有涉及数据修改的语句(DML、COPY FROM、PREPARE、EXECUTE等)。对于explain和explain analyze,如果后面的语句类型符合,也会被记录
- all:所有语句
② 记录时机
SQL语句解析成功后,执行前。因此即使设置为all,也不会记录有语法错误的语句(如果想记录,应该使用log_min_error_statement参数)。
③ 记录内容
记录SQL语句,包含参数,但不包含执行用户、主机名等信息,这些需要审计插件才有。
2. log_min_duration_statement
① 作用
记录超过指定执行时间阈值的SQL,可选值为:
- -1:关闭(默认)
- 0:所有语句
- 正数:慢SQL阈值
② 记录时机
SQL语句执行完成后,因此能记录到执行时间。
③ 记录内容
- 记录SQL语句、执行时间,但不包含参数。
- 对于使用扩展查询协议的客户端,对语法分析、绑定、执行每一步所花时间会独立记录。
3. 同时符合两者的SQL会如何?
- 语句在解析完成后、执行开始前,即被记入日志(log_statement生效)
- 语句执行完成后,单独将duration记入日志(log_min_duration_statement生效),但不再重复记录语句
- 因此建议使用log_line_prefix记录PID或会话ID,避免duration和语句关联不上
二、 效果测试
1. 两者均设置
log_statement='all',log_min_duration_statement='0s'
2023-11-28 17:51:45.222 CST [2484] LOG: statement: select pg_sleep(10);
2023-11-28 17:51:55.227 CST [2484] LOG: duration: 10005.249 ms
语句记录为开始时间(log_statement生效),duration在执行完成时单独记录(log_min_duration_statement生效),但不再重复记录语句。
2. 仅设置log_statement
log_statement='all',log_min_duration_statement='-1'(禁用)
2023-11-28 17:53:49.540 CST [2760] LOG: statement: select pg_sleep(10);
仅记录语句,记录时间为开始时间,没有duration
3. 仅设置log_min_duration_statement
log_statement='none'(禁用),log_min_duration_statement='0s'
2023-11-28 17:55:20.288 CST [2826] LOG: duration: 15015.447 ms statement: select pg_sleep(15);
记录语句和duration,记录时间为结束时间
三、 参数记录时机
从上面文档可以知道,两个参数记录时机都在SQL执行阶段,只是一个在前一个在后。源码中SQL执行相关的函数主要是exec_simple_query,因此我们就看看这个函数。
函数刚开头就可以看到 pg_parse_query 和 check_log_statement函数。
/*
* exec_simple_query
*
* Execute a "simple Query" protocol message.
*/
static void
exec_simple_query(const char *query_string)
{
CommandDest dest = whereToSendOutput;
MemoryContext oldcontext;
List *parsetree_list;
ListCell *parsetree_item;
bool save_log_statement_stats = log_statement_stats;
bool was_logged = false;
bool use_implicit_block;
char msec_str[32];
...
/*
* Do basic parsing of the query or queries (this should be safe even if
* we are in aborted transaction state!)
*/
parsetree_list = pg_parse_query(query_string);
/* Log immediately if dictated by log_statement */
if (check_log_statement(parsetree_list))
{
ereport(LOG,
(errmsg("statement: %s", query_string),
errhidestmt(true),
errdetail_execute(parsetree_list)));
was_logged = true;
}
- pg_parse_query函数用于SQL解析,符合log_statement记录的语句发生在解析完成后
- check_log_statement函数就用于检查log_statement的设置,标记是否需要记录
- was_logged=true表示已记录SQL语句,主要是给后面的慢SQL记录函数,提示其不需重复记录SQL文本
/*
* check_log_statement
* Determine whether command should be logged because of log_statement
*
* stmt_list can be either raw grammar output or a list of planned
* statements
*/
static bool
check_log_statement(List *stmt_list)
{
ListCell *stmt_item;
if (log_statement == LOGSTMT_NONE)
return false;
if (log_statement == LOGSTMT_ALL)
return true;
/* Else we have to inspect the statement(s) to see whether to log */
foreach(stmt_item, stmt_list)
{
Node *stmt = (Node *) lfirst(stmt_item);
if (GetCommandLogLevel(stmt) <= log_statement)
return true;
}
return false;
}
后面一大堆是SQL执行相关的代码,非本次重点,直接拉到函数末尾,可以看到check_log_duration函数,看名字都能猜到是它了,也符合在SQL执行完才记录。
...
/*
* Emit duration logging if appropriate.
*/
switch (check_log_duration(msec_str, was_logged))
{
case 1:
ereport(LOG,
(errmsg("duration: %s ms", msec_str),
errhidestmt(true)));
break;
case 2:
ereport(LOG,
(errmsg("duration: %s ms statement: %s",
msec_str, query_string),
errhidestmt(true),
errdetail_execute(parsetree_list)));
break;
}
if (save_log_statement_stats)
ShowUsage("QUERY STATISTICS");
TRACE_POSTGRESQL_QUERY_DONE(query_string);
debug_query_string = NULL;
}
check_log_duration函数
可以看到注释中给出了上面case 1,2的含义:
- 1:仅记录duration,即前面提到的两个参数均生效的场景
- 2:记录duration和语句:即前面提到的仅log_min_duration_statement生效的场景
另外是否记录慢sql与 log_duration、log_min_duration_sample 这些参数也有关。
/*
* check_log_duration
* Determine whether current command's duration should be logged
* We also check if this statement in this transaction must be logged
* (regardless of its duration).
*
* Returns:
* 0 if no logging is needed
* 1 if just the duration should be logged
* 2 if duration and query details should be logged
*
* If logging is needed, the duration in msec is formatted into msec_str[],
* which must be a 32-byte buffer.
*
* was_logged should be true if caller already logged query details (this
* essentially prevents 2 from being returned).
*/
int
check_log_duration(char *msec_str, bool was_logged)
{
if (log_duration || log_min_duration_sample >= 0 ||
log_min_duration_statement >= 0 || xact_is_sampled)
{
long secs;
int usecs;
int msecs;
bool exceeded_duration;
bool exceeded_sample_duration;
bool in_sample = false;
TimestampDifference(GetCurrentStatementStartTimestamp(),
GetCurrentTimestamp(),
&secs, &usecs);
msecs = usecs / 1000;
/*
* This odd-looking test for log_min_duration_* being exceeded is
* designed to avoid integer overflow with very long durations: don't
* compute secs * 1000 until we've verified it will fit in int.
*/
exceeded_duration = (log_min_duration_statement == 0 ||
(log_min_duration_statement > 0 &&
(secs > log_min_duration_statement / 1000 ||
secs * 1000 + msecs >= log_min_duration_statement)));
exceeded_sample_duration = (log_min_duration_sample == 0 ||
(log_min_duration_sample > 0 &&
(secs > log_min_duration_sample / 1000 ||
secs * 1000 + msecs >= log_min_duration_sample)));
/*
* Do not log if log_statement_sample_rate = 0. Log a sample if
* log_statement_sample_rate <= 1 and avoid unnecessary random() call
* if log_statement_sample_rate = 1.
*/
if (exceeded_sample_duration)
in_sample = log_statement_sample_rate != 0 &&
(log_statement_sample_rate == 1 ||
random() <= log_statement_sample_rate * MAX_RANDOM_VALUE);
if (exceeded_duration || in_sample || log_duration || xact_is_sampled)
{
snprintf(msec_str, 32, "%ld.%03d",
secs * 1000 + msecs, usecs % 1000);
if ((exceeded_duration || in_sample || xact_is_sampled) && !was_logged)
return 2;
else
return 1;
}
}
return 0;
}
参考:
https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-STATEMENT
https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
https://blog.csdn.net/qq_35423190/article/details/129138740
https://iwmj.wordpress.com/2018/04/03/postgresql-%E5%89%8D%E5%90%8E%E7%AB%AF%E5%8D%8F%E8%AE%AE%E4%B8%AD%E7%9A%84%E6%9F%A5%E8%AF%A2%E6%96%B9%E5%BC%8F%EF%BC%9A%E7%AE%80%E5%8D%95%E6%9F%A5%E8%AF%A2%E3%80%81%E6%89%A9%E5%B1%95%E6%9F%A5%E8%AF%A2/