【MySQL】ROW_NUMBER 窗口函数妙用之报告系统状态的连续日期
力扣题
1、题目地址
1225. 报告系统状态的连续日期
2、模拟表
表:Failed
Column Name | Type |
---|---|
fail_date | date |
- 该表主键为 fail_date (具有唯一值的列)。
- 该表包含失败任务的天数.
表: Succeeded
Column Name | Type |
---|---|
success_date | date |
- 该表主键为 success_date (具有唯一值的列)。
- 该表包含成功任务的天数.
3、要求
系统 每天 运行一个任务。每个任务都独立于先前的任务。任务的状态可以是失败或是成功。
编写解决方案找出 2019-01-01 到 2019-12-31 期间任务连续同状态 period_state 的起止日期(start_date 和 end_date)。
即如果任务失败了,就是失败状态的起止日期,如果任务成功了,就是成功状态的起止日期。
最后结果按照起始日期 start_date 排序。
4、示例
输入:
Failed 表:
fail_date |
---|
2018-12-28 |
2018-12-29 |
2019-01-04 |
2019-01-05 |
Succeeded 表:
success_date |
---|
2018-12-30 |
2018-12-31 |
2019-01-01 |
2019-01-02 |
2019-01-03 |
2019-01-06 |
输出:
period_state | start_date | end_date |
---|---|---|
succeeded | 2019-01-01 | 2019-01-03 |
failed | 2019-01-04 | 2019-01-05 |
succeeded | 2019-01-06 | 2019-01-06 |
解释:
结果忽略了 2018 年的记录,因为我们只关心从 2019-01-01 到 2019-12-31 的记录
从 2019-01-01 到 2019-01-03 所有任务成功,系统状态为 “succeeded”。
从 2019-01-04 到 2019-01-05 所有任务失败,系统状态为 “failed”。
从 2019-01-06 到 2019-01-06 所有任务成功,系统状态为 “succeeded”。
5、代码编写
网友写法
SELECT state AS period_state,
MIN(date) AS start_date,
MAX(date) AS end_date
FROM (
SELECT *,
row_number() over (partition by state order by date asc) AS rk1,
row_number() over (order by date asc) AS rk2
FROM (
SELECT fail_date AS 'date', 'failed' AS state FROM failed
UNION ALL
SELECT success_date, 'succeeded' FROM succeeded
) t
) t2
WHERE date BETWEEN '2019-01-01' AND '2019-12-31'
GROUP BY state, rk2-rk1
代码分析
第一步,将两个表数据查询出来并标上状态,方便后面用窗口函数处理
SELECT fail_date AS 'date', 'failed' AS state FROM failed
UNION ALL
SELECT success_date, 'succeeded' FROM succeeded
date | state |
---|---|
2018-12-28 | failed |
2018-12-29 | failed |
2019-01-04 | failed |
2019-01-05 | failed |
2018-12-30 | succeeded |
2018-12-31 | succeeded |
2019-01-01 | succeeded |
2019-01-02 | succeeded |
2019-01-03 | succeeded |
2019-01-06 | succeeded |
第二步,第一个 row_number 根据状态的不同,分组后按日期正序进行排序处理,第二个 row_number 很明显是按日期正序,后面使用
SELECT *,
row_number() over (partition by state order by date asc) AS rk1,
row_number() over (order by date asc) AS rk2
FROM (
SELECT fail_date AS 'date', 'failed' AS state FROM failed
UNION ALL
SELECT success_date, 'succeeded' FROM succeeded
) t
date | state | rk1 | rk2 |
---|---|---|---|
2018-12-28 | failed | 1 | 1 |
2018-12-29 | failed | 2 | 2 |
2018-12-30 | succeeded | 1 | 3 |
2018-12-31 | succeeded | 2 | 4 |
2019-01-01 | succeeded | 3 | 5 |
2019-01-02 | succeeded | 4 | 6 |
2019-01-03 | succeeded | 5 | 7 |
2019-01-04 | failed | 3 | 8 |
2019-01-05 | failed | 4 | 9 |
2019-01-06 | succeeded | 6 | 10 |
第三步,首先根据状态分组能做状态区分,第二点是关键,rk2-rk1,因为rk2是连续的,所以很清楚就能知道 rk2-rk1 如果是相等的那必定是连续的,但是可能会碰到 state 不同,rk2-rk1 相同的情况(前面先 group by state 就是首先排除这种情况)
SELECT state AS period_state,
MIN(date) AS start_date,
MAX(date) AS end_date
FROM (
SELECT *,
row_number() over (partition by state order by date asc) AS rk1,
row_number() over (order by date asc) AS rk2
FROM (
SELECT fail_date AS 'date', 'failed' AS state FROM failed
UNION ALL
SELECT success_date, 'succeeded' FROM succeeded
) t
) t2
WHERE date BETWEEN '2019-01-01' AND '2019-12-31'
GROUP BY state, rk2-rk1
period_state | start_date | end_date |
---|---|---|
succeeded | 2019-01-01 | 2019-01-03 |
failed | 2019-01-04 | 2019-01-05 |
succeeded | 2019-01-06 | 2019-01-06 |