中级练习[10]:Hive SQL
目录
1. 各品类销量前三的所有商品
1.1 题目需求
1.2 代码实现
2. 各品类中商品价格的中位数
2.1 题目需求
2.2 代码实现
3. 找出销售额连续3天超过100的商品
3.1 题目需求
3.2 代码实现
1. 各品类销量前三的所有商品
1.1 题目需求
从订单详情表中(order_detail)和商品表(sku_info)中查询各个品类销售数量前三的商品。如果该品类小于三个商品,则输出所有的商品销量。
Sku_id(商品id) | Category_id(品类id) |
---|---|
2 | 1 |
4 | 1 |
1 | 1 |
8 | 2 |
7 | 2 |
5 | 2 |
12 | 3 |
11 | 3 |
10 | 3 |
1.2 代码实现
hive>
select
t2.sku_id,
t2.category_id
from
(
select
t1.sku_id,
si.category_id,
rank() over(partition by category_id order by t1.sku_sum desc) rk
from
(
select
sku_id,
sum(sku_num) sku_sum
from
order_detail
group by
sku_id
)t1
join
sku_info si
on
t1.sku_id = si.sku_id
)t2
where
t2.rk <= 3;
2. 各品类中商品价格的中位数
2.1 题目需求
从商品表(sku_info)中求出每个品类的价格中位数。如果是偶数则输出中间两个值的平均值,如果是奇数,则输出中间数即可。
Category_id(品类id) | Medprice(中位数) |
---|---|
1 | 3500.0 |
2 | 1250.0 |
3 | 510.0 |
2.2 代码实现
hive>
-- 求每个品类价格排序后的商品数量,并标记奇偶性
select
sku_id,
category_id,
price,
row_number() over(partition by category_id order by price desc) rk,
count(*) over(partition by category_id) cn,
count(*) over(partition by category_id) % 2 flag
from
sku_info t1
-- 求出偶数品类的中位数
select
distinct t1.category_id,
avg(t1.price) over(partition by t1.category_id) medprice
from
(
select
sku_id,
category_id,
price,
row_number() over(partition by category_id order by price desc) rk,
count(*) over(partition by category_id) cn,
count(*) over(partition by category_id) % 2 flag
from
sku_info
)t1
where
t1.flag = 0 and (t1.rk = cn / 2 or t1.rk = cn / 2 + 1)
-- 求出奇数品类的中位数
union all
select
t1.category_id,
t1.price
from
(
select
sku_id,
category_id,
price,
row_number() over(partition by category_id order by price desc) rk,
count(*) over(partition by category_id) cn,
count(*) over(partition by category_id) % 2 flag
from
sku_info
)t1
where
t1.flag = 1 and t1.rk = round(cn / 2)
3. 找出销售额连续3天超过100的商品
3.1 题目需求
从订单详情表(order_detail)中找出销售额连续3天超过100的商品。
Sku_id(商品id) |
---|
1 |
10 |
11 |
12 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
3.2 代码实现
hive>
-- 每个商品每天的销售总额
select
sku_id,
create_date,
sum(price * sku_num) sku_sum
from
order_detail
group by
sku_id, create_date
having
sku_sum >= 100
-- 判断连续三天以上
select
distinct t3.sku_id
from
(
select
t2.sku_id,
count(*) over(partition by t2.sku_id, t2.date_drk) cdrk
from
(
select
t1.sku_id,
t1.create_date,
date_sub(t1.create_date, rank() over(partition by t1.sku_id order by t1.create_date)) date_drk
from
(
select
sku_id,
create_date,
sum(price * sku_num) sku_sum
from
order_detail
group by
sku_id, create_date
having
sku_sum >= 100
)t1
)t2
)t3
where
t3.cdrk >= 3