Problem description
I want to fill the NULL values in the device column for each session_id, using the associated non-NULL value. How can I do that?
Here is the sample data:
+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 |         |
| 351acc     | step2 |         |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 |         |
| 350bca     | step3 |         |
| 350bca     | step4 | desktop |
+------------+-------+---------+
Expected output:
+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 | mobile  |
| 351acc     | step2 | mobile  |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 | desktop |
| 350bca     | step3 | desktop |
| 350bca     | step4 | desktop |
+------------+-------+---------+
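Solution
Judging by your data sample, each session has a single device, so you can simply add a subquery that fetches the value from the session's other rows: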
WITH j (session_id, step, device) AS (
  VALUES ('351acc','step1',NULL), ('351acc','step2',NULL),
         ('351acc','step3','mobile'), ('351acc','step4','mobile'),
         ('350bca','step1','desktop'), ('350bca','step2',NULL),
         ('350bca','step3',NULL), ('350bca','step4','desktop')
)
SELECT session_id, step,
  (SELECT DISTINCT device
   FROM j q2
   WHERE q2.session_id = q1.session_id AND q2.device IS NOT NULL) AS device
FROM j q1
ORDER BY session_id, step;
 session_id | step  | device
------------+-------+---------
 350bca     | step1 | desktop
 350bca     | step2 | desktop
 350bca     | step3 | desktop
 350bca     | step4 | desktop
 351acc     | step1 | mobile
 351acc     | step2 | mobile
 351acc     | step3 | mobile
 351acc     | step4 | mobile
(8 rows)
Demo: db<>fiddle
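The scalar subquery relies on each session having at most one distinct non-NULL device. If a session had two, PostgreSQL would reject the query because the subquery returns more than one row. A minimal sketch of a sanity check, assuming PostgreSQL (which the psql-style output suggests) and reusing the sample rows from the question; the alias device_count is only illustrative:

```sql
-- Sample rows from the question, inlined so the check runs on its own.
WITH j (session_id, step, device) AS (
  VALUES ('351acc','step1',NULL), ('351acc','step2',NULL),
         ('351acc','step3','mobile'), ('351acc','step4','mobile'),
         ('350bca','step1','desktop'), ('350bca','step2',NULL),
         ('350bca','step3',NULL), ('350bca','step4','desktop')
)
-- count(DISTINCT device) ignores NULLs, so only real device values are counted.
SELECT session_id, count(DISTINCT device) AS device_count
FROM   j
GROUP  BY session_id
HAVING count(DISTINCT device) > 1;  -- sessions that would break the subquery
```

For the sample data this returns no rows, so the one-device-per-session assumption holds there.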
The window function first_value() with the right sort order is probably cheapest:
SELECT session_id, step
     , COALESCE(device
              , first_value(device) OVER (PARTITION BY session_id ORDER BY device IS NULL, step)
               ) AS device
FROM   tbl
ORDER  BY session_id DESC, step;
dbfiddle here
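ORDER BY device IS NULL, step sorts NULL values last (device IS NULL is false for non-null rows, and false sorts before true), so the earliest step with a non-null value is picked.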
If the non-null device is always the same within each session_id, you can simplify to just ORDER BY device IS NULL. And you don't need COALESCE.
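The simplification just described might look like the following sketch. It assumes each session_id has only one distinct non-NULL device, and it inlines the question's sample rows as a CTE named tbl so that it runs on its own; against a real table you would drop the WITH clause.

```sql
-- Sample rows from the question, standing in for the real table.
WITH tbl (session_id, step, device) AS (
  VALUES ('351acc','step1',NULL), ('351acc','step2',NULL),
         ('351acc','step3','mobile'), ('351acc','step4','mobile'),
         ('350bca','step1','desktop'), ('350bca','step2',NULL),
         ('350bca','step3',NULL), ('350bca','step4','desktop')
)
SELECT session_id, step
       -- Non-null rows sort first, so the first value in the partition is
       -- non-null whenever the session has any device at all.
     , first_value(device) OVER (PARTITION BY session_id
                                 ORDER BY device IS NULL) AS device
FROM   tbl
ORDER  BY session_id DESC, step;
```

For the sample rows this returns mobile for every 351acc step and desktop for every 350bca step. COALESCE becomes redundant because the window already yields the non-null value for rows that have one.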
select session_id, step, coalesce(device, max(device) over (partition by session_id order by step desc)) device
from tbl
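One caveat with the ORDER BY inside that window: in PostgreSQL the default frame then runs only from the start of the partition to the current row, so max() sees just the rows from the last step down to the current one. The sample data happens to work, but a session whose final steps have a NULL device would keep those NULLs. A possible variant, offered as a sketch rather than the answerer's own query, drops the ORDER BY so that the frame covers the whole partition (the CTE named tbl again inlines the sample rows):

```sql
-- Sample rows from the question, standing in for the real table.
with tbl (session_id, step, device) as (
  values ('351acc','step1',null), ('351acc','step2',null),
         ('351acc','step3','mobile'), ('351acc','step4','mobile'),
         ('350bca','step1','desktop'), ('350bca','step2',null),
         ('350bca','step3',null), ('350bca','step4','desktop')
)
select session_id, step
       -- No ORDER BY in the window: max() looks at every row of the session
       -- and ignores NULLs, so any non-null device fills the gaps.
     , coalesce(device, max(device) over (partition by session_id)) as device
from   tbl
order  by session_id, step;
```

If a session could carry several different devices, max() would pick the alphabetically largest one for the NULL rows, so this variant also fits best when each session has a single device.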