SQL: group by a field and return only one joined row per group

Question

data

+-----+----------------+--------+----------------+
| ID  |  required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 1   |  7 August      |  cat   |  X             |
| 2   |  7 August      |  cat   |  Y             |
| 3   |  10 August     |  cat   |  Z             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

I want to group by name and then, for each group, select one of the rows with the earliest date.

For this data set, I want to end up with either rows 1 and 4 or rows 2 and 4.

Expected result:

+-----+----------------+--------+----------------+
| ID  |  required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 1   |  7 August      |  cat   |  X             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

OR

+-----+----------------+--------+----------------+
| ID  |  required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 2   |  7 August      |  cat   |  Y             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

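I have a query that returns rows 1, 2, and 4, but I am not sure how to pick just one row from the first group to get the desired result. I join the aggregate back to the data table so that I can recover id and another_field after grouping: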

SELECT d.id,d.name,d.required_by,d.another_field
FROM 
(
  SELECT min(required_by) as min_date,name
  FROM data
  GROUP BY name
) agg
INNER JOIN 
data d
on d.required_by = agg.min_date AND d.name = agg.name
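
The join keeps both tied rows because ids 1 and 2 share the minimum date for cat. A minimal sketch that reproduces this with Python's built-in sqlite3 module is below. It assumes required_by is stored as an ISO date string; the year 2020 is an arbitrary placeholder, since the question only gives day and month.

```python
import sqlite3

# In-memory copy of the question's table.
# Assumption: dates are ISO strings; the year is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    "id INTEGER PRIMARY KEY, required_by TEXT, name TEXT, another_field TEXT)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, "2020-08-07", "cat", "X"),
        (2, "2020-08-07", "cat", "Y"),
        (3, "2020-08-10", "cat", "Z"),
        (4, "2020-08-11", "dog", "A"),
    ],
)

# The query from the question: join each group's minimum date back to the table.
join_rows = conn.execute("""
    SELECT d.id, d.name, d.required_by, d.another_field
    FROM (
        SELECT MIN(required_by) AS min_date, name
        FROM data
        GROUP BY name
    ) agg
    INNER JOIN data d
        ON d.required_by = agg.min_date AND d.name = agg.name
    ORDER BY d.id
""").fetchall()

print([row[0] for row in join_rows])  # prints [1, 2, 4]
```

Every row that matches the group's minimum date survives the join, so a tie on the date yields more than one row per name.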

Answer

This is usually solved with a window function:

select d.id,d.name,d.required_by,d.another_field
from (
  select id,name,required_by,another_field,row_number() over (partition by name order by required_by) as rn
  from data
) d
where d.rn = 1;
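
As a quick check, the window-function approach can be run against the sample data with Python's sqlite3 module (SQLite 3.25 or later is needed for window functions). This is a sketch under two assumptions that are not in the original: dates are stored as ISO strings with a placeholder year, and id is added to the ORDER BY so that the choice between tied rows is deterministic.

```python
import sqlite3

# In-memory copy of the question's table.
# Assumption: dates are ISO strings; the year is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    "id INTEGER PRIMARY KEY, required_by TEXT, name TEXT, another_field TEXT)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, "2020-08-07", "cat", "X"),
        (2, "2020-08-07", "cat", "Y"),
        (3, "2020-08-10", "cat", "Z"),
        (4, "2020-08-11", "dog", "A"),
    ],
)

# Number the rows inside each name group, earliest date first,
# then keep only the first row of every group.
ranked_rows = conn.execute("""
    SELECT id, name, required_by, another_field
    FROM (
        SELECT id, name, required_by, another_field,
               ROW_NUMBER() OVER (
                   PARTITION BY name
                   ORDER BY required_by, id  -- id breaks ties (assumption)
               ) AS rn
        FROM data
    ) d
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(ranked_rows)
# prints [(1, 'cat', '2020-08-07', 'X'), (4, 'dog', '2020-08-11', 'A')]
```

Without the id tie-breaker the database is free to return either row 1 or row 2 for cat, which the question says is acceptable.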

In Postgres, using distinct on () is usually faster:

select distinct on (name) *
from data
order by name,required_by

Alternatively, in SQL Server, filter with a subquery:

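-- [date] plays the role of required_by in the question's table.
-- Correlated subquery: keep one id per name, earliest date first, id breaks ties.
SELECT d.[id], d.[date], d.[name]
FROM [test].[dbo].[data] AS d
WHERE d.[id] = (SELECT TOP (1) i.[id] FROM [test].[dbo].[data] AS i
                WHERE i.[name] = d.[name] ORDER BY i.[date], i.[id]);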
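
Where window functions are not available, a correlated subquery that picks a single id per name also returns exactly one row per group. The sketch below tries this with Python's sqlite3 module; SQLite uses LIMIT 1 where SQL Server would use TOP (1). The ISO date strings with a placeholder year, the required_by column name from the question, and the id tie-breaker are assumptions made for the example.

```python
import sqlite3

# In-memory copy of the question's table.
# Assumption: dates are ISO strings; the year is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    "id INTEGER PRIMARY KEY, required_by TEXT, name TEXT, another_field TEXT)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, "2020-08-07", "cat", "X"),
        (2, "2020-08-07", "cat", "Y"),
        (3, "2020-08-10", "cat", "Z"),
        (4, "2020-08-11", "dog", "A"),
    ],
)

# For every outer row, the subquery finds the id of the earliest row
# with the same name; only that row passes the filter.
picked_rows = conn.execute("""
    SELECT d.id, d.name, d.required_by, d.another_field
    FROM data AS d
    WHERE d.id = (
        SELECT i.id
        FROM data AS i
        WHERE i.name = d.name
        ORDER BY i.required_by, i.id
        LIMIT 1
    )
    ORDER BY d.id
""").fetchall()

print(picked_rows)
# prints [(1, 'cat', '2020-08-07', 'X'), (4, 'dog', '2020-08-11', 'A')]
```

Comparing on id rather than on the date is what removes the tie: two rows can share a date, but only one of them has the id the subquery returns.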
