芝法酱学习笔记(2.2)——sql性能优化2
一、前言
在上一节中,我们使用实验的方式,验证了销售单报表的一些sql性能优化的猜想。但实验结果出乎我们的意料,首先是时间查询使用char和datetime相比,char可能更快,使用bigint(转为秒)和char速度相当。其次是最令人不可理解的是,在连表的时候,直接使用主键id做连表,竟然远远比使用多条件联合索引的连表方式更慢。
小编苦思冥想,认为在千万级数据量下,bigint的筛选效率可能没有联合索引高。那么,如果我们把主键设置为联合主键,在连表时会不会更快呢?那么,就开干吧
二、表设计
2.1 item_new表
我们新建一个item表,表结构和索引如下图所示:
主键:enp_id,id
不添加任何其他索引
2.2 consign_new表
consign表也类似,我们也用联合主键
主键:enp_id,header_id,id
三、数据同步导入
为了控制变量,我们不重新生成数据,我们把上次的数据查询出来并相应的插入新表
@Override
public void syncNewTable() {
// 查出所有企业
List<GenEnterpriseEntity> enterpriseEntities = mEnterpriseDbService.list();
for(GenEnterpriseEntity enterpriseEntity : enterpriseEntities) {
log.info("开始导入"+enterpriseEntity.getName()+"数据");
List<GenItemEntity> itemEntityList = mItemDbService.listByEnpId(enterpriseEntity.getId());
List<GenItemNewEntity> itemNewEntityList = new ArrayList<>();
for(GenItemEntity itemEntity : itemEntityList) {
GenItemNewEntity itemNewEntity = new GenItemNewEntity();
itemNewEntity.createInit();
itemNewEntity.setEnpId(itemEntity.getId());
itemNewEntity.setEnpCode(itemEntity.getEnpCode());
itemNewEntity.setId(itemEntity.getId());
itemNewEntity.setName(itemEntity.getName());
itemNewEntity.setCost(itemEntity.getCost());
itemNewEntity.setTestData(true);
itemNewEntityList.add(itemNewEntity);
}
TransactionTemplate template = new TransactionTemplate(mTransactionManager);
template.execute(status ->{
mItemNewDbService.saveBatch(itemNewEntityList);
return true;
});
log.info("导入商品完成");
final String DAY_BEGIN = "2018-01-01";
final String DAY_END = "2024-12-31";
LocalDate startDate = LocalDate.parse(DAY_BEGIN);
LocalDate endDate = LocalDate.parse(DAY_END);
while (!startDate.isAfter(endDate)) {
log.info("导入"+startDate+"的销售单数据");
LocalDateTime billTimeBeg = startDate.atTime(0, 0, 0);
LocalDateTime billTimeEnd = startDate.atTime(23, 59, 59);
long billTimeKeyBeg = CommonUtil.LocalDateTimeToSecond(billTimeBeg);
long billTimeKeyEnd = CommonUtil.LocalDateTimeToSecond(billTimeEnd);
List<GenConsignEntity> consignEntityList = mConsignDbService.findAll(enterpriseEntity.getId(),billTimeKeyBeg,billTimeKeyEnd);
List<GenConsignNewEntity> consignNewEntityList = new ArrayList<>();
for(GenConsignEntity consignEntity : consignEntityList) {
GenConsignNewEntity consignNewEntity = new GenConsignNewEntity();
consignNewEntity.createInit();
consignNewEntity.setId(consignEntity.getId());
consignNewEntity.setEnpId(consignEntity.getEnpId());
consignNewEntity.setHeaderId(consignEntity.getHeaderId());
consignNewEntity.setBillTimeKey(consignEntity.getBillTimeKey());
consignNewEntity.setItemId(consignEntity.getItemId());
consignNewEntity.setItemName(consignEntity.getItemName());
consignNewEntity.setItemCnt(consignEntity.getItemCnt());
consignNewEntity.setPrice(consignEntity.getPrice());
consignNewEntity.setDescription(consignEntity.getDescription());
consignNewEntity.setTestData(true);
consignNewEntityList.add(consignNewEntity);
}
consignNewEntityList.sort(Comparator.comparing(GenConsignNewEntity::getHeaderId));
template = new TransactionTemplate(mTransactionManager);
template.execute(status -> {
mConsignNewDbService.saveBatch(consignNewEntityList);
return true;
});
log.info(startDate+"的销售单数据导入完成");
startDate = startDate.plusDays(1l);
}
}
}
四、mapper改写
4.1 枚举
@RequiredArgsConstructor
@EnumDesc
public enum EHeaderJoinMode {
NONE(0,"不连表","不连表,直接使用consign表做查询",null,null),
ID_JOIN(1,"id关联","consign_header的id与consign的header_id做关联","id","header_id"),
BILL_NO_JOIN(2,"订单号关联","header表的enp_id和bill_no与consin相应字段关联","bill_no","bill_no"),
NEW_CONSIGN(3,"新consign表","enp_id和header_id做连接","id","header_id");
@EnumValue
@Getter
private final int code;
@Getter
private final String name;
@Getter
private final String desc;
@Getter
private final String headerCol;
@Getter
private final String consignCol;
}
@RequiredArgsConstructor
@EnumDesc
public enum EItemJoinMode {
NONE(0,"不连接","不连接item表",null,null),
ID_JOIN(1,"id连接","使用id链接item","item_id","id"),
STR_ID_JOIN(2,"字符串id连接","使用字符串id做连接","item_str_id","id"),
REL_ID_JOIN(3,"关联id连接","不但使用字符串id做连接,item表也不用主键","item_str_id","rel_id"),
NEW_TABLE(4,"和item_new做连接","和item_new做连接,不但连id,还连enp_id","item_id","id");
@EnumValue
@Getter
private final int code;
@Getter
private final String name;
@Getter
private final String desc;
@Getter
private final String consignCol;
@Getter
private final String itemCol;
}
4.2 xml改写
这里不放完整代码了,就改了2处
<choose>
<when test="IN.headerJoin.name() == 'ID_JOIN'">
consign_header h JOIN consign c ON h.${IN.headerJoin.headerCol} = c.${IN.headerJoin.consignCol}
</when>
<when test="IN.headerJoin.name() == 'BILL_NO_JOIN'">
consign_header h JOIN consign c ON h.${IN.headerJoin.headerCol} = c.${IN.headerJoin.consignCol} AND h.enp_id = c.enp_id
</when>
<!--新增模式-->
<when test="IN.headerJoin.name() == 'NEW_CONSIGN'">
consign_header h JOIN consign_new c ON h.${IN.headerJoin.headerCol} = c.${IN.headerJoin.consignCol} AND h.enp_id = c.enp_id
</when>
<otherwise>
consign c
</otherwise>
</choose>
<choose>
<when test="IN.itemJoin.name() == 'ID_JOIN'">
JOIN item i ON c.${IN.itemJoin.consignCol} = i.${IN.itemJoin.itemCol}
</when>
<when test="IN.itemJoin.name() == 'STR_ID_JOIN'">
JOIN item i ON c.${IN.itemJoin.consignCol} = i.${IN.itemJoin.itemCol}
</when>
<when test="IN.itemJoin.name() == 'REL_ID_JOIN'">
JOIN item i ON c.${IN.itemJoin.consignCol} = i.${IN.itemJoin.itemCol} AND c.enp_id = i.enp_id
</when>
<!--新增模式-->
<when test="IN.itemJoin.name() == 'NEW_TABLE'">
JOIN item i ON c.${IN.itemJoin.consignCol} = i.${IN.itemJoin.itemCol} AND c.enp_id = i.enp_id
</when>
</choose>
五、实验
请求json
{
"current": 2,
"size": 10,
"enterpriseId": 1869035491194941444,
"billTimeBeg": "2024-04-01",
"billTimeEnd": "2024-07-31",
"headerJoin": "NEW_CONSIGN",
"itemJoin": "NEW_TABLE",
"orderBy": "PROFIT",
"billTimeMode": "BILL_TIME_KEY"
}
生成sql:
explain SELECT c.item_id,c.item_name,
SUM(c.item_cnt) AS total_cnt,
SUM(c.price * c.item_cnt) AS total_amount,
SUM((c.price - i.cost) * c.item_cnt) AS total_profit
FROM consign_header h JOIN consign_new c ON h.id = c.header_id AND h.enp_id = c.enp_id
JOIN item i ON c.item_id = i.id AND c.enp_id = i.enp_id
WHERE h.enp_id = 1869035491194941444
AND h.bill_time_key BETWEEN 1711900800 AND 1722441599
GROUP BY item_id
ORDER BY total_profit LIMIT 10,10
explain结果
select_type | table | type | key | ken_ken | rows | filtered |
---|---|---|---|---|---|---|
simple | h | rang | idx_time_key | 16 | 11516 | 100 |
simple | c | ref | PRIMARY | 16 | 43 | 100 |
simple | i | ref | PRIMARY | 8 | 1 | 100 |
执行时间
enp_id | enp_code | cnt | time |
---|---|---|---|
1869035491194941442 | enp_001 | 248814 | 2.638s |
1869035491194941443 | enp_002 | 263780 | 2.285s |
1869035491194941444 | enp_003 | 120522 | 1.157s |
1869035491194941445 | enp_004 | 84262 | 1.003s |
1869035491194941446 | enp_005 | 174673 | 4.157s |
1869035491194941447 | enp_006 | 342751 | 4.105s |
1869035491194941448 | enp_007 | 52964 | 0.48s |
1869035491194941449 | enp_008 | 172159 | 3.895s |
1869035491194941450 | enp_009 | 181632 | 4.688s |
1869035491194941451 | enp_010 | 188382 | 5.168s |
先前使用id主键连表的执行时间
enp_id | enp_code | cnt | time |
---|---|---|---|
1869035491194941442 | enp_001 | 248814 | 19.311s |
1869035491194941443 | enp_002 | 263780 | 18.534s |
1869035491194941444 | enp_003 | 120522 | 13.849s |
1869035491194941445 | enp_004 | 84262 | 5.782s |
1869035491194941446 | enp_005 | 174673 | 21.158s |
1869035491194941447 | enp_006 | 342751 | 20.927s |
1869035491194941448 | enp_007 | 52964 | 3.087s |
1869035491194941449 | enp_008 | 172159 | 19.982s |
1869035491194941450 | enp_009 | 181632 | 23.256s |
1869035491194941451 | enp_010 | 188382 | 26.057s |
结论:
效率比先前,提升了6~8倍!!!
六、经验总结
在设计表的时候,如果有明确的父子表层级关系(1对多),并且数据量很大,子表的主键直接设计成联合主键。
比如本案例中的,enterprise -> item,consign_header; consign_header->consign
七、一些其他筛选参数
上节我们还有一些其他筛选参数的情况没有测试,诸如门店id,业务员id等
群友们可以自行测试,我这里就不耗费篇幅了。
结论就是按照预期走了索引。
八、order by问题
上期由于篇幅原因,还有一个问题没有实验,就是order by id使得主键索引覆盖了正常的索引。
小编懒得写代码做实验了,我们直接写sql吧:
ORDER BY id DESC:
SELECT * FROM consign_header h
WHERE h.enp_id = 1869035491194941447
AND h.bill_time_key BETWEEN 1711900800 AND 1722441599
ORDER BY id DESC LIMIT 100;
在workbench的执行时间:0.047S
ORDER BY bill_time_key DESC:
explain SELECT * FROM consign_header h
WHERE h.enp_id = 1869035491194941447
AND h.bill_time_key BETWEEN 1711900800 AND 1722441599
ORDER BY h.bill_time_key DESC LIMIT 100;
在workbench的执行时间:0.015S,并且第二次执行会因缓存变为0秒
我们可以看到,order by id 会使得查找条件不走索引,而走了主键,并且速度显著降低。