Join on multiple columns, choosing the minimal difference in one of the integer columns

Problem description

I have a table t1 that I want to join with table t2 (shown below) on columns a, b and c.

t1

+---------+---------+---------+
|a        |b        |c        |
+---------+---------+---------+
|473200   |1        |1.-1-1   |
|472400   |10       |1.-1-1   |
|472800   |10       |1.-1-1   |
|473200   |93       |1.-1-1   |
|472800   |26240    |1.-1-1   |
+---------+---------+---------+

t2

+---------+---------+---------+
|a        |b        |c        |
+---------+---------+---------+
|473200   |1        |1.-1-1   |
|472400   |10       |1.-1-1   |
|472800   |10       |1.-1-1   |
|473200   |93       |1.-1-1   |
|472800   |26250    |1.-1-1   |
+---------+---------+---------+

When I join on a and c only, the result is:

+---------+---------+---------+---------+
|t1.b     |t2.b     |a        |c        |
+---------+---------+---------+---------+
|93       |1        |473200   |1.-1-1   |
|1        |1        |473200   |1.-1-1   |
|10       |10       |472400   |1.-1-1   |
|10       |10       |472800   |1.-1-1   |
|26240    |10       |472800   |1.-1-1   |
|93       |93       |473200   |1.-1-1   |
|1        |93       |473200   |1.-1-1   |
|10       |26250    |472800   |1.-1-1   |
|26240    |26250    |472800   |1.-1-1   |
+---------+---------+---------+---------+

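What I want is to add column b to the 'on' clause as well, so that each row of t1 is joined to the row of t2 (with the same a and c) whose b differs least from its own.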

Desired result

+---------+---------+---------+---------+
|t1.b     |t2.b     |a        |c        |
+---------+---------+---------+---------+
|1        |1        |473200   |1.-1-1   |
|10       |10       |472400   |1.-1-1   |
|10       |10       |472800   |1.-1-1   |
|93       |93       |473200   |1.-1-1   |
|26240    |26250    |472800   |1.-1-1   |
+---------+---------+---------+---------+

I saw something similar here:

https://dba.stackexchange.com/questions/73804/how-to-retrieve-closest-value-based-on-look-up-table

but I am not sure how to apply it to my case.

Solution

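One option is a lateral join. For each row of t1, the subquery picks the single t2 row with the same a and c and the closest b: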

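select t1.*, t2.b as b2
from t1
cross join lateral (
    -- for each t1 row, keep only the matching t2 row with the closest b
    select t2.*
    from t2
    where t2.a = t1.a and t2.c = t1.c
    order by abs(t2.b - t1.b)
    limit 1
) t2   -- the alias lets the outer select reference t2.b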
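
To try this against the data from the question, the two tables can be created as below. This is only a minimal sketch: the column types (integer for a and b, text for c) are assumptions, since the question does not state them.

```sql
-- Assumed column types; the question does not state them.
create table t1 (a integer, b integer, c text);
create table t2 (a integer, b integer, c text);

-- Rows copied from the t1 table in the question.
insert into t1 (a, b, c) values
    (473200, 1,     '1.-1-1'),
    (472400, 10,    '1.-1-1'),
    (472800, 10,    '1.-1-1'),
    (473200, 93,    '1.-1-1'),
    (472800, 26240, '1.-1-1');

-- Rows copied from the t2 table in the question.
insert into t2 (a, b, c) values
    (473200, 1,     '1.-1-1'),
    (472400, 10,    '1.-1-1'),
    (472800, 10,    '1.-1-1'),
    (473200, 93,    '1.-1-1'),
    (472800, 26250, '1.-1-1');
```

With this data, a closest-match join on b should reproduce the five rows listed under the desired result, including the pair 26240 / 26250.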

Another possibility is distinct on, but you need a primary key for t1. Assuming that the (a, c) tuple uniquely identifies each row in t1, you would do:

select distinct on (t1.a,t1.c) t1.*,t2.b b2
from t1
inner join t2 on t2.a = t1.a and t2.c = t1.c
order by t1.a,t1.c,abs(t2.b - t1.b)
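
In the sample data from the question, (a, c) is not unique in t1: 473200 and 472800 each appear twice. A variant for that case might look like the following sketch, which assumes instead that (a, b, c) identifies each t1 row (true for the sample rows, but an assumption in general).

```sql
-- Assumes (a, b, c) uniquely identifies each row of t1.
select distinct on (t1.a, t1.b, t1.c)
       t1.b as b1, t2.b as b2, t1.a, t1.c
from t1
inner join t2 on t2.a = t1.a and t2.c = t1.c
-- the leading order by columns must match the distinct on list;
-- the last term decides which t2 row is kept for each t1 row
order by t1.a, t1.b, t1.c, abs(t2.b - t1.b);
```

distinct on keeps the first row of each group in the order given by order by, so sorting by the absolute difference last makes the closest t2 row the one that survives.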

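Join the tables and compute the difference in column b, then use distinct on, ordered by that difference, to keep a single row for each row of t1. Because (a, c) is not unique in the sample t1, b1 is part of the distinct on key. Note that the condition t1.b <= t2.b only considers t2 rows whose b is greater than or equal to t1.b, so a t1 row with no such match is dropped.

with joined as (
  select t1.a, t1.c, t1.b as b1, t2.b as b2, t2.b - t1.b as b_diff
    from t1
         join t2
           on t2.a = t1.a
          and t2.c = t1.c    -- match on c as well as a
          and t1.b <= t2.b   -- only candidates at or above t1.b, so b_diff >= 0
)
select distinct on (a, c, b1) b1, b2, a, c
  from joined
 order by a, c, b1, b_diff
;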
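
If neither lateral nor distinct on is available, the same closest-match idea can be written with the row_number() window function. This is an alternative sketch, not part of the answers above; it assumes that (a, b, c) identifies each t1 row and uses t2.b as an arbitrary tie-breaker when two t2 rows are equally close.

```sql
-- Assumes (a, b, c) uniquely identifies each row of t1.
select b1, b2, a, c
from (
    select t1.b as b1, t2.b as b2, t1.a, t1.c,
           row_number() over (
               partition by t1.a, t1.b, t1.c
               -- closest b first; t2.b makes ties deterministic
               order by abs(t2.b - t1.b), t2.b
           ) as rn
    from t1
    join t2 on t2.a = t1.a and t2.c = t1.c
) ranked
where rn = 1;
```

Each t1 row gets its candidate t2 rows numbered from closest to farthest, and the outer query keeps only the first candidate.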