UNION ALL vs. LEFT JOIN: Differences and Use Cases
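LEFT JOIN is probably the most commonly used join, but it cannot handle every situation. A join builds an intermediate result by pairing each row of the left table with every matching row of the right table. If the tables are all related one-to-one, you can join them any way you like without trouble. In practice, though, one-to-many relationships are common, and then the row count multiplies the way a Cartesian product does, so you have to decide case by case.
For example, suppose table A relates to table B one-to-many, and table A also relates to table C one-to-many. Some rows in A match several rows in B and C, and some rows in A match nothing in B or C.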
CREATE TABLE `A` (
  `id` int(11) NOT NULL,
  `b_id` int(11) DEFAULT NULL,
  `c_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `B` (
  `id` int(11) NOT NULL,
  `a_id` int(11) DEFAULT NULL,
  `b_count` int(11) DEFAULT NULL,
  `b_staus` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `C` (
  `id` int(11) NOT NULL,
  `a_id` int(11) DEFAULT NULL,
  `c_count` int(11) DEFAULT NULL,
  `c_staus` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
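Our goal is a total per A row: take b_count from the related B rows where b_staus is 1, and c_count from the related C rows where c_staus is 2.
If we left join both tables onto A, the row multiplication "inflates" the result set and the totals come out wrong.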
——A—— ——B—— ——C——
a * b * c
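The result we want is (ab) + (ac). Chained left joins cannot produce that; worse, when the a * b * c combination is fed into aggregate functions, some values are counted more than once. The way out of this is UNION ALL.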
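The inflation described above can be reproduced on a tiny data set. This is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL; the sample rows are invented for illustration, and only the columns needed for the demonstration are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY);
CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b_count INTEGER, b_staus INTEGER);
CREATE TABLE C (id INTEGER PRIMARY KEY, a_id INTEGER, c_count INTEGER, c_staus INTEGER);
INSERT INTO A VALUES (1);
-- two B rows and three C rows all point at A.id = 1 (made-up sample data)
INSERT INTO B VALUES (1, 1, 10, 1), (2, 1, 20, 1);
INSERT INTO C VALUES (1, 1, 1, 2), (2, 1, 2, 2), (3, 1, 3, 2);
""")

# Chained LEFT JOINs: every B row pairs with every C row (2 * 3 = 6 rows),
# so each b_count is summed 3 times and each c_count 2 times.
joined = conn.execute("""
    SELECT COUNT(*), SUM(B.b_count), SUM(C.c_count)
    FROM A
    LEFT JOIN B ON B.a_id = A.id AND B.b_staus = 1
    LEFT JOIN C ON C.a_id = A.id AND C.c_staus = 2
    GROUP BY A.id
""").fetchone()

# UNION ALL: stack the (A-B) rows on top of the (A-C) rows, then aggregate once.
stacked = conn.execute("""
    SELECT COUNT(*), SUM(cnt)
    FROM (
        SELECT A.id AS id, B.b_count AS cnt
        FROM A JOIN B ON B.a_id = A.id AND B.b_staus = 1
        UNION ALL
        SELECT A.id AS id, C.c_count AS cnt
        FROM A JOIN C ON C.a_id = A.id AND C.c_staus = 2
    ) AS t
    GROUP BY id
""").fetchone()

print(joined)   # (6, 90, 12): inflated, the true totals are 30 and 6
print(stacked)  # (5, 36): 2 + 3 rows, 30 + 6 = 36
```

The joined query reports 90 and 12 because of the 2 * 3 fan-out, while the stacked query sees each B and C row exactly once.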
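1 UNION ALL
When the results of two SELECT statements need to be shown as one result set, use the UNION or UNION ALL keyword. UNION combines the results of multiple queries into a single result set.
(UNION / UNION ALL is the "glue" that sticks two SELECT result sets together, just as (ab) + (ac) above joins (ab) and (ac).)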
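Two things to note:
(1) UNION has to scan for and remove duplicate rows, so it is slower. If you do not specifically need duplicates removed, use UNION ALL.
(2) The two SELECT statements being combined must return the same number of columns, and the corresponding column types must be compatible.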
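Both notes can be checked quickly. The following is a small sketch using Python's built-in sqlite3 (an assumption made for runnability; the article's own database is MySQL). SQLite is lenient about column types, so only the column-count rule is demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# UNION ALL keeps every row from both SELECTs, duplicates included.
all_rows = conn.execute(
    "SELECT 1 AS n UNION ALL SELECT 1 UNION ALL SELECT 2"
).fetchall()

# UNION removes duplicate rows, which costs an extra de-duplication step.
distinct_rows = conn.execute(
    "SELECT 1 AS n UNION SELECT 1 UNION SELECT 2 ORDER BY n"
).fetchall()

# The two sides must return the same number of columns,
# otherwise the statement is rejected before it runs.
try:
    conn.execute("SELECT 1, 2 UNION ALL SELECT 3")
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

print(sorted(all_rows))   # three rows, the duplicate 1 is kept
print(distinct_rows)      # two rows, the duplicate 1 is gone
print(mismatch_error)     # the engine's column-count complaint
```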
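1.1 UNION ALL, UNION, and the other set operators
UNION: the union of two result sets, with duplicate rows removed. Some engines happen to return the rows sorted as a side effect of de-duplication, but the order is not guaranteed without an ORDER BY;
UNION ALL: the union of two result sets, with duplicate rows kept and no sorting;
INTERSECT: the intersection of two result sets, with duplicate rows removed; again, the order is not guaranteed without an ORDER BY;
MINUS: the difference of two result sets, with duplicate rows removed. MINUS is Oracle's keyword; standard SQL calls it EXCEPT.
UNION ALL is the one used most, because the others do extra work such as de-duplication that an ordinary result set does not need. Even when duplicates must go, you can handle that with WHERE conditions, or with GROUP BY combined with aggregate functions or a HAVING filter.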
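To compare the four operators side by side, here is a short sketch run against SQLite through Python's sqlite3 (chosen only so the example is self-contained). SQLite spells the difference operator EXCEPT; the tables left_set and right_set are hypothetical. An explicit ORDER BY is added so the output order does not depend on the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE left_set (n INTEGER);
CREATE TABLE right_set (n INTEGER);
INSERT INTO left_set VALUES (1), (2), (2), (3);
INSERT INTO right_set VALUES (2), (3), (4);
""")

def run(op):
    """Combine the two tables with the given set operator, sorted by n."""
    sql = f"SELECT n FROM left_set {op} SELECT n FROM right_set ORDER BY n"
    return [row[0] for row in conn.execute(sql)]

results = {op: run(op) for op in ("UNION ALL", "UNION", "INTERSECT", "EXCEPT")}

for op, values in results.items():
    print(op, values)
# UNION ALL [1, 2, 2, 2, 3, 3, 4]
# UNION [1, 2, 3, 4]
# INTERSECT [2, 3]
# EXCEPT [1]
```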
2 When to use UNION ALL, compared with LEFT JOIN
One saying captures it well:
UNION adds rows; LEFT JOIN adds columns.
As a picture: UNION ALL is for stacking separate parts on top of each other, like this:
A
B
C
LEFT JOIN is for when table A needs to be related to table B so that some of B's columns are shown alongside A's.
A B
a1 a2 a3 a4 b1 b2 b3 a1
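The "rows versus columns" picture can be confirmed by measuring the shape of each result. This is an illustrative sketch with Python's sqlite3; the columns a_name and b_name are hypothetical and exist only to give the tables something to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, a_name TEXT);
CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b_name TEXT);
INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
INSERT INTO B VALUES (1, 1, 'b1');
""")

def shape(sql):
    """Return (row count, column count) of a query result."""
    cur = conn.execute(sql)
    rows = cur.fetchall()
    return len(rows), len(cur.description)

# UNION ALL stacks B's rows under A's rows: more rows, same columns.
union_shape = shape(
    "SELECT id, a_name FROM A UNION ALL SELECT id, b_name FROM B"
)

# LEFT JOIN keeps one row per A row here and appends B's columns
# (NULLs for the A row that has no match).
join_shape = shape(
    "SELECT A.id, A.a_name, B.id, B.b_name "
    "FROM A LEFT JOIN B ON B.a_id = A.id"
)

print(union_shape)  # (3, 2)
print(join_shape)   # (2, 4)
```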
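3 Points to watch when combining UNION ALL and LEFT JOIN with GROUP BY and aggregate functions
3.1 GROUP BY, HAVING, and aggregate functions
GROUP BY is used all the time, and it has two regular partners: the HAVING filter and aggregate functions.
GROUP BY a.id
HAVING COUNT(*) > 0
HAVING exists because WHERE runs before grouping and cannot filter on aggregated values; HAVING fills that gap by filtering the grouped result set.
The common aggregate functions are SUM, COUNT, AVG, MAX, and MIN. (There are also less common ones such as VAR, VARP, STDEV, STDEVP, GROUPING, and CHECKSUM; the exact names vary by database. Even when a variance or standard deviation is needed, people often skip these functions and process the exported results instead.)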
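The division of labour between WHERE and HAVING is easiest to see in one query. A minimal sketch with Python's sqlite3 and made-up rows in a table shaped like B:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b_count INTEGER, b_staus INTEGER);
INSERT INTO B VALUES
    (1, 1, 10, 1),
    (2, 1, 20, 1),
    (3, 2,  5, 1),
    (4, 2,  7, 0),
    (5, 3,  4, 0);
""")

rows = conn.execute("""
    SELECT a_id, COUNT(*) AS n, SUM(b_count) AS total
    FROM B
    WHERE b_staus = 1          -- filters individual rows before grouping
    GROUP BY a_id
    HAVING SUM(b_count) > 10   -- filters whole groups after aggregation
    ORDER BY a_id
""").fetchall()

print(rows)  # [(1, 2, 30)]
```

WHERE drops rows 4 and 5 first, leaving groups a_id = 1 (total 30) and a_id = 2 (total 5); HAVING then discards the second group.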
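3.2 Key points for combining grouping with UNION ALL and LEFT JOIN
Take particular care with UNION ALL and grouped aggregates. As a rule, put the GROUP BY after the UNION ALL result set, that is, in an outer query wrapped around it. Be careful about putting aggregate functions inside the branches: an aggregate collapses rows, so a branch can return fewer rows than expected.
Here is a UNION ALL example: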
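The difference between grouping outside the UNION ALL and aggregating inside its branches can be sketched as follows. The example assumes Python's sqlite3 and invented rows; the tables carry only the columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b_count INTEGER);
CREATE TABLE C (id INTEGER PRIMARY KEY, a_id INTEGER, c_count INTEGER);
INSERT INTO B VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);
INSERT INTO C VALUES (1, 1, 1), (2, 2, 2), (3, 2, 3);
""")

# GROUP BY placed outside the UNION ALL: every detail row survives the
# stacking step, and the totals are computed once per a_id.
per_id = conn.execute("""
    SELECT a_id, SUM(cnt) AS total
    FROM (
        SELECT a_id, b_count AS cnt FROM B
        UNION ALL
        SELECT a_id, c_count AS cnt FROM C
    ) AS t
    GROUP BY a_id
    ORDER BY a_id
""").fetchall()

# Aggregate placed inside each branch without grouping: each branch
# collapses to a single row, so the per-a_id detail is gone.
collapsed = conn.execute("""
    SELECT SUM(b_count) AS cnt FROM B
    UNION ALL
    SELECT SUM(c_count) AS cnt FROM C
""").fetchall()

print(per_id)             # [(1, 31), (2, 10)]
print(sorted(collapsed))  # [(6,), (35,)]
```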
SELECT
out_count AS count,
( out_count - SUM( back_detail_count ) ) AS could_back_count,
SUM( back_detail_count ) AS back_count,
a.*
FROM
(
SELECT
aosd.material_archives_id AS material_archives_id,
amc.category_name AS category_name,
ama.brand AS brand,
ama.model AS model,
ama.unit AS unit,
ama.material_name AS material_name,
ama.material_code AS material_code,
aosd.batch_id AS batch_id,
ab.batch_no AS batch_no,
aos.storage_id AS storage_id,
ast.storage_name AS storage_name,
aosd.out_storage_id AS out_storage_id,
aosd.id AS id,
aosd.update_time AS update_time,
aos.out_storage_code AS out_storage_code,
aos.out_storage_date AS out_storage_date,
( CASE WHEN abao.id IS NULL THEN 0 ELSE abaod.apply_count END ) AS back_detail_count,
aosd.count AS out_count
FROM
ams_out_storage_detail aosd
LEFT JOIN ams_out_storage aos ON aos.id = aosd.out_storage_id
AND aos.data_status = 1
LEFT JOIN ams_material_archives ama ON ama.id = aosd.material_archives_id
AND ama.data_status = 1
LEFT JOIN ams_material_category amc ON amc.id = ama.category_id
LEFT JOIN ams_batch ab ON aosd.batch_id = ab.id
LEFT JOIN ams_storage ast ON aos.storage_id = ast.id
LEFT JOIN ams_back_apply_order_detail abaod ON abaod.out_storage_detail_id = aosd.id
LEFT JOIN ams_back_apply_order abao ON abao.id = abaod.back_apply_order_id
AND abao.back_apply_order_status IN ( 1, 6 )
WHERE
aosd.data_status = 1
AND ama.is_support_back_storage = 1
AND aos.out_storage_status = 1
AND aosd.merchant_id = 372
AND aos.employee_id IN ( 7818 )
UNION ALL
SELECT
aosd.material_archives_id AS material_archives_id,
amc.category_name AS category_name,
ama.brand AS brand,
ama.model AS model,
ama.unit AS unit,
ama.material_name AS material_name,
ama.material_code AS material_code,
aosd.batch_id AS batch_id,
ab.batch_no AS batch_no,
aos.storage_id AS storage_id,
ast.storage_name AS storage_name,
aosd.out_storage_id AS out_storage_id,
aosd.id AS id,
aosd.update_time AS update_time,
aos.out_storage_code AS out_storage_code,
aos.out_storage_date AS out_storage_date,
-- test the status-filtered header row (abo), mirroring the abao check in the first branch
( CASE WHEN abo.id IS NULL THEN 0 ELSE abod.count END ) AS back_detail_count,
aosd.count AS out_count
FROM
ams_out_storage_detail aosd
LEFT JOIN ams_out_storage aos ON aos.id = aosd.out_storage_id
AND aos.data_status = 1
LEFT JOIN ams_material_archives ama ON ama.id = aosd.material_archives_id
AND ama.data_status = 1
LEFT JOIN ams_material_category amc ON amc.id = ama.category_id
LEFT JOIN ams_batch ab ON aosd.batch_id = ab.id
LEFT JOIN ams_storage ast ON aos.storage_id = ast.id
LEFT JOIN ams_back_order_detail abod ON aosd.id = abod.out_storage_detail_id
LEFT JOIN ams_back_order abo ON abo.id = abod.back_order_id
AND abo.back_order_status IN ( 1, 2 )
WHERE
aosd.data_status = 1
AND ama.is_support_back_storage = 1
AND aos.out_storage_status = 1
AND aosd.merchant_id = 372
AND aos.employee_id IN ( 7818 )
) a
GROUP BY
a.id
HAVING
could_back_count > 0
ORDER BY
a.update_time DESC
LIMIT 0, 10
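LEFT JOIN has no equivalent of the UNION ALL rule, because its purpose is to add columns. When combining it with GROUP BY, HAVING, and aggregate functions, watch that a one-to-many match does not unexpectedly multiply the rows and inflate the grouped aggregates.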
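One common way to keep LEFT JOIN from inflating aggregates is to aggregate each "many" side down to one row per key before joining. This technique is not spelled out in the article; the sketch below is an assumption about how it could look, using Python's sqlite3, invented rows, and the hypothetical aliases b_sum and c_sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY);
CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b_count INTEGER, b_staus INTEGER);
CREATE TABLE C (id INTEGER PRIMARY KEY, a_id INTEGER, c_count INTEGER, c_staus INTEGER);
INSERT INTO A VALUES (1), (2);
INSERT INTO B VALUES (1, 1, 10, 1), (2, 1, 20, 1);
INSERT INTO C VALUES (1, 1, 1, 2), (2, 1, 2, 2), (3, 1, 3, 2);
""")

# Each derived table returns at most one row per a_id, so the joins stay
# one-to-one and the row count of A is preserved.
rows = conn.execute("""
    SELECT A.id,
           COALESCE(b_sum.total, 0) AS b_total,
           COALESCE(c_sum.total, 0) AS c_total
    FROM A
    LEFT JOIN (
        SELECT a_id, SUM(b_count) AS total
        FROM B WHERE b_staus = 1
        GROUP BY a_id
    ) AS b_sum ON b_sum.a_id = A.id
    LEFT JOIN (
        SELECT a_id, SUM(c_count) AS total
        FROM C WHERE c_staus = 2
        GROUP BY a_id
    ) AS c_sum ON c_sum.a_id = A.id
    ORDER BY A.id
""").fetchall()

print(rows)  # [(1, 30, 6), (2, 0, 0)]
```

Row 2 of A has no matches, so COALESCE turns its NULL totals into 0 instead of dropping the row.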