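Problem description
Can someone help me join/merge the tables below?
I know how to do this when the department columns (depart_1, depart_2, depart_3) are all in one table, but I can't get it to work in this scenario because they are spread across different tables.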
Solution
Use JOIN and UNION:
-- Every branch of a UNION must return the same columns in the same order:
-- id, name, gender, seq, department
SELECT
id, name, gender, 1 AS seq, depart_1 AS department
FROM tab1
UNION
SELECT
id, name, gender, 2 AS seq, depart_2 AS department
FROM tab1
UNION
SELECT
tab1.id, tab2.name, tab1.gender, 3 AS seq, tab2.depart_3 AS department
FROM tab2 JOIN tab1 ON tab2.id = tab1.id
UNION
SELECT
tab1.id, tab2.name, tab1.gender, 4 AS seq, tab2.depart_4 AS department
FROM tab2 JOIN tab1 ON tab2.id = tab1.id
UNION
SELECT
tab1.id, tab3.name, tab1.gender, 5 AS seq, tab3.depart_5 AS department
FROM tab3 JOIN tab1 ON tab3.id = tab1.id
UNION
SELECT
tab1.id, tab3.name, tab1.gender, 6 AS seq, tab3.depart_6 AS department
FROM tab3 JOIN tab1 ON tab3.id = tab1.id
Each query reads one department column, so you can use a fixed number for the seq column in each query.
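To try this out, here is a minimal sketch. The schemas are only my guess based on the column names used in the query above, and the sample rows are made up, so adjust both to your real tables. It shows how the fixed seq value records which depart_N column each row came from.

```sql
-- Hypothetical schemas, inferred from the columns referenced in the query above.
CREATE TABLE tab1 (id INT, name STRING, gender STRING, depart_1 STRING, depart_2 STRING);
CREATE TABLE tab2 (id INT, name STRING, depart_3 STRING, depart_4 STRING);
CREATE TABLE tab3 (id INT, name STRING, depart_5 STRING, depart_6 STRING);

-- Made-up sample rows, for illustration only.
INSERT INTO TABLE tab1 VALUES (1, 'Ann', 'F', 'HR', 'IT');
INSERT INTO TABLE tab2 VALUES (1, 'Ann', 'Sales', 'Legal');
INSERT INTO TABLE tab3 VALUES (1, 'Ann', 'Ops', 'Finance');

-- One branch per department column; seq is a constant in each branch.
-- The union is wrapped in a subquery so the ORDER BY applies to the whole result.
SELECT u.id, u.seq, u.department
FROM (
  SELECT id, 1 AS seq, depart_1 AS department FROM tab1
  UNION ALL
  SELECT id, 2 AS seq, depart_2 AS department FROM tab1
  UNION ALL
  SELECT id, 3 AS seq, depart_3 AS department FROM tab2
) u
ORDER BY u.id, u.seq;
```

Because seq differs between branches, rows from different branches can never be duplicates of each other. UNION ALL therefore returns the same rows as UNION unless a single branch already contains duplicate rows, and it skips the de-duplication step.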
It is better to do all the unions first and then perform one smaller join at the end, rather than joining in every branch:
SELECT tab.id, tab.name, gen.gender, tab.seq, tab.department
FROM
(SELECT id, name, 1 AS seq, depart_1 AS department FROM tab1
UNION
SELECT id, name, 2 AS seq, depart_2 AS department FROM tab1
UNION
SELECT id, name, 3 AS seq, depart_3 AS department FROM tab2
UNION
SELECT id, name, 4 AS seq, depart_4 AS department FROM tab2
-- add the depart_5 / depart_6 branches from tab3 in the same way
) tab LEFT JOIN
(SELECT DISTINCT id, gender FROM tab1) gen ON tab.id = gen.id
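Since you are doing this in Hive, the join against the small id/gender subquery can be converted into a map-side join automatically, provided hive.auto.convert.join is enabled and the small side fits under the configured size threshold. That should make your query faster.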
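If you want to check that the map-side join is actually happening, something like the following may help. It is only a sketch: whether Hive converts the join depends on your Hive version and on its size estimate for the small side, and the threshold shown is the documented default, not a tuned value.

```sql
-- Let Hive convert a common join into a map-side join when one side is small.
SET hive.auto.convert.join=true;
-- Size in bytes below which a table counts as "small" (25000000 is the documented default).
SET hive.mapjoin.smalltable.filesize=25000000;

-- Inspect the plan; a converted join shows up as a "Map Join Operator".
-- The inner union is shortened here; use the full query in practice.
EXPLAIN
SELECT tab.id, tab.name, gen.gender, tab.seq, tab.department
FROM
(SELECT id, name, 1 AS seq, depart_1 AS department FROM tab1
UNION ALL
SELECT id, name, 3 AS seq, depart_3 AS department FROM tab2
) tab LEFT JOIN
(SELECT DISTINCT id, gender FROM tab1) gen ON tab.id = gen.id;
```

If the plan shows an ordinary join instead, the small side was probably estimated to be above the threshold.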