01问题
前期开发反馈在使用独立子查询时,不论子查询中结果集有几个,语句都会卡死迟迟得不到返回结果。但是如果去掉子查询,直接赋值查询很快得到返回结果。听到这个情况第一反应很可能DBLE在独立子查询上没有做相关优化,真的是这样么?下面我们将问题复现以及优化方式进行展示。
02 演示及优化
环境检查
DBLE版本:2.19.11.5
MySQL版本:5.7.28
涉及分片表:src_biz_filelist
分片键:batch_no
涉及垂直表:src_image_txn_jnl
分片拆分规则:stringhash
节点数量:4
2.1 原始语句及执行计划
独立子查询语句如下:
select caption,batch_no,image_no from src_biz_filelist where batch_no in (SELECT imageid FROM src_image_txn_jnl WHERE front_seq_num = '001780568097' and txn_dt = '2029-08-20')
执行计划如下:
mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no in (SELECT imageid FROM src_image_txn_jnl WHERE front_seq_num = '001780568097' and txn_dt = '2029-08-20');
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| dn2_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| dn3_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| dn4_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| merge_and_order_1 | MERGE_AND_ORDER | dn1_0; dn2_0; dn3_0; dn4_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_and_order_1 |
| dn1_1 | BASE SQL | select `autoalias_src_image_txn_jnl`.`autoalias_scalar` from (select distinct `src_image_txn_jnl`.`imageid` as `autoalias_scalar` from `src_image_txn_jnl` where (`src_image_txn_jnl`.`front_seq_num` = '001780568097') AND (`src_image_txn_jnl`.`txn_dt` = '2029-08-20') order by autoalias_scalar ASC) autoalias_src_image_txn_jnl order by `autoalias_src_image_txn_jnl`.`autoalias_scalar` ASC |
| merge_1 | MERGE | dn1_1 |
| join_1 | JOIN | shuffle_field_1; merge_1 |
| shuffle_field_2 | SHUFFLE_FIELD | join_1 |
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)
**执行计划可见,DBLE对语句进行了拆分。首先分别在4个数据节点全表扫描分片表src_biz_filelist,将各自返回的结果集在DBLE层合并排序。然后将子查询结果去重排序后返回。最后在DBLE层进行匹配操作。当分片表数据量较大时,全部提取数据动作,即使不与子查询中结果集匹配,效率也不会很好。
**
按照开发描述改写子查询语句赋予具体值,进行查看:
mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no in ('0250002403989000IMMEDA200630095447WV','1650002432532000IMMEDA200630095233WS');
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn4 | BASE SQL | select caption,batch_no,image_no from src_biz_filelist where batch_no in ('0250002403989000IMMEDA200630095447WV','1650002432532000IMMEDA200630095233WS') |
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 rows in set (0.00 sec)
因为分片表src_biz_filelist的分片键为batch_no,语句直接下压到了数据所在的dn4节点。验证了开发描述的在使用独立子查询时,不论子查询中结果集有几个,语句都会卡死迟迟得不到返回结果。如果去掉子查询赋予具体值,很快得到返回结果。程序进行语句拆分传入具体值查询,也是我们的优化方法之一。那是不是DBLE在独立子查询上真的没有做相关优化呢?我们继续操作。
**2.2 **独立标量子查询语句及执行计划
如果明确子查询结果单个值情况下,可以改写使用独立标量子查询,执行计划如下:
mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no = (SELECT imageid FROM src_image_txn_jnl WHERE front_seq_num = '001780568097' and txn_dt = '2029-08-20');
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `src_image_txn_jnl`.`imageid` as `autoalias_scalar` from `src_image_txn_jnl` where (`src_image_txn_jnl`.`front_seq_num` = '001780568097') AND (`src_image_txn_jnl`.`txn_dt` = '2029-08-20') limit 0,2 |
| merge_1 | MERGE | dn1_0 |
| scalar_sub_query_1 | SCALAR_SUB_QUERY | merge_1 |
| dn1_1 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| dn2_0 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| dn3_0 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| dn4_0 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| merge_2 | MERGE | dn1_1; dn2_0; dn3_0; dn4_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_2 |
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)
执行计划可见:dn1_0部分先将子查询结果返回,出现了SCALAR_SUB_QUERY标量子查询的标识,被引用到外部查询中,语句直接下压到了全部数据节点。这种情况下依据分片键进行等值查询,效率是OK的。
****03 解决
通过查看原始语句在DBLE的执行计划,下压到MySQL时,由于DBLE对语句进行了重写,并没有使用到MySQL优化器改写语句为semi join方式。在MySQL中需要使用子查询完成的任务,很多时候可以使用连接方式来实现,那么我们在上层就重写为join语句实验一下。
查看DBLE文档server.xml 系统参数配置部分有一个useJoinStrategy参数,该参数作用是开启之后会尝试判断join两边的where来重新调整查询SQL下发的顺序,默认关闭。添加true至server.xml的标签中,并重启DBLE。
查看改写之后语句执行计划:
mysql> explain select t1.caption,t1.batch_no,t1.image_no from src_biz_filelist t1 inner join src_image_txn_jnl t2 on t1.batch_no=t2.imageid WHERE t2.front_seq_num = '001780568097' and t2.txn_dt = '2029-08-20';
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `t2`.`id`,`t2`.`evt_id`,`t2`.`front_seq_num`,`t2`.`org_id`,`t2`.`teller_id`,`t2`.`txn_dt`,`t2`.`txn_tm`,`t2`.`txn_type_cd`,`t2`.`txn_cd`,`t2`.`sub_seq_num`,`t2`.`bcs_seq_num`,`t2`.`user_ccy_cd`,`t2`.`acct_num`,`t2`.`acct_ccy_cd`,`t2`.`acct_amt`,`t2`.`txn_ccy_cd`,`t2`.`txn_amt`,`t2`.`exch_rate`,`t2`.`resp_num`,`t2`.`proc_info`,`t2`.`auth_teller_id`,`t2`.`terminal_num`,`t2`.`cntrpty_acct_num`,`t2`.`cntrpty_acct_name`,`t2`.`vchr_type_encode`,`t2`.`vchr_num`,`t2`.`vchr_dt`,`t2`.`chk_teller_id`,`t2`.`vchr_main_file_desc`,`t2`.`vchr_accessory_desc`,`t2`.`biz_status_cd`,`t2`.`fin_txn_ind`,`t2`.`txn_complt_dt`,`t2`.`ident_chk_chker_id`,`t2`.`ident_chk_result_cd`,`t2`.`veri_seal_chker_id`,`t2`.`veri_seal_result_cd`,`t2`.`tel_che_chker_id`,`t2`.`tel_che_result_cd`,`t2`.`tel_che_comment`,`t2`.`pay_acct_doubt_type_cd`,`t2`.`rcv_acct_doubt_type_cd`,`t2`.`job_center`,`t2`.`imageid`,`t2`.`pay_name`,`t2`.`txn_desc`,`t2`.`origin_txn_seq_num`,`t2`.`txn_start_tm`,`t2`.`txn_memo`,`t2`.`bcs_or_back_txn_num`,`t2`.`bea_intrfc_num`,`t2`.`txn_auth_teller`,`t2`.`send_back_or_back_err_info`,`t2`.`data_date` from `src_image_txn_jnl` `t2` where (`t2`.`front_seq_num` = '001780568097') AND (`t2`.`txn_dt` = '2029-08-20') order by `t2`.`imageid` ASC |
| merge_1 | MERGE | dn1_0 |
| nest_loop_1 | NEST_LOOP | merge_1 |
| dn1_1 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| dn2_0 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| dn3_0 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| dn4_0 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| merge_and_order_1 | MERGE_AND_ORDER | dn1_1; dn2_0; dn3_0; dn4_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_and_order_1 |
| join_1 | JOIN | nest_loop_1; shuffle_field_1 |
| shuffle_field_2 | SHUFFLE_FIELD | join_1 |
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)
执行计划可见:dn1_0部分先将子查询结果返回,出现了NEST_LOOP嵌套循环标识,嵌套循环的结果集被引用到外部查询中,语句直接下压到了全部数据节点。
开启MySQL general_log 日志查看语句实际在MySQL层运行情况如下:
2021-05-12T08:58:09.324017+08:00 1080 Query select `t2`.`id`,`t2`.`evt_id`,`t2`.`front_seq_num`,`t2`.`org_id`,`t2`.`teller_id`,`t2`.`txn_dt`,`t2`.`txn_tm`,`t2`.`txn_type_cd`,`t2`.`txn_cd`,`t2`.`sub_seq_num`,`t2`.`bcs_seq_num`,`t2`.`user_ccy_cd`,`t2`.`acct_num`,`t2`.`acct_ccy_cd`,`t2`.`acct_amt`,`t2`.`txn_ccy_cd`,`t2`.`txn_amt`,`t2`.`exch_rate`,`t2`.`resp_num`,`t2`.`proc_info`,`t2`.`auth_teller_id`,`t2`.`terminal_num`,`t2`.`cntrpty_acct_num`,`t2`.`cntrpty_acct_name`,`t2`.`vchr_type_encode`,`t2`.`vchr_num`,`t2`.`vchr_dt`,`t2`.`chk_teller_id`,`t2`.`vchr_main_file_desc`,`t2`.`vchr_accessory_desc`,`t2`.`biz_status_cd`,`t2`.`fin_txn_ind`,`t2`.`txn_complt_dt`,`t2`.`ident_chk_chker_id`,`t2`.`ident_chk_result_cd`,`t2`.`veri_seal_chker_id`,`t2`.`veri_seal_result_cd`,`t2`.`tel_che_chker_id`,`t2`.`tel_che_result_cd`,`t2`.`tel_che_comment`,`t2`.`pay_acct_doubt_type_cd`,`t2`.`rcv_acct_doubt_type_cd`,`t2`.`job_center`,`t2`.`imageid`,`t2`.`pay_name`,`t2`.`txn_desc`,`t2`.`origin_txn_seq_num`,`t2`.`txn_start_tm`,`t2`.`txn_memo`,`t2`.`bcs_or_back_txn_num`,`t2`.`bea_intrfc_num`,`t2`.`txn_auth_teller`,`t2`.`send_back_or_back_err_info`,`t2`.`data_date` from `src_image_txn_jnl` `t2` where (`t2`.`front_seq_num` = '001780568097') AND (`t2`.`txn_dt` = '2029-08-20') order by `t2`.`imageid` ASC
2021-05-12T08:58:09.326400+08:00 1091 Query select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('0250002403989000IMMEDA200630095447WV', '0250002401013000IMMEDA200914092919JX') ORDER BY `t1`.`batch_no` ASC
语句执行顺序变化,独立子查询被赋予了具体值,并按分片键进行数据路由的特性,下发到指定分片节点进行查询。
需要注意的是关于nestLoop还有两个参数nestLoopConnSize /nestLoopRowsSize 若临时表行数大于这两个值乘积,DBLE则会报告一个后端连接错误。
综上优化方式有以下几种:
1、程序进行语句拆分传入具体值进行查询。
2、如果明确子查询结果单个值情况下,可以改写使用独立标量子查询。
3、添加 true 参数,并改写为连接查询。