Proper set operation to find a matching set in a set of sets, or a full join?

Problem description

TLDR

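How do I match a set of sets against a single set and bind the result to the corresponding rows?

Given a row with a linked table of key/value pairs describing that row's properties, and a set of search descriptions (targets) describing how to summarize the row's contents, how can I find which search description matches a given row, based on matching the row's property table against the key/value pairs in the search description?

Simplified example: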

CREATE TABLE Targetkeyvalue(TargetId INT,TargetKey NVARCHAR(50),TargetValue NVARCHAR(50))
CREATE TABLE OriginalRows(Id INT,Cost DECIMAL,BunchOfOtherCols NVARCHAR(500),CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC))
CREATE TABLE Rowkeyvalue(RowId INT,KeyPart NVARCHAR(50),ValuePart NVARCHAR(50),CONSTRAINT [FK_RowId_Id] FOREIGN KEY (RowId) REFERENCES OriginalRows(Id))

INSERT INTO OriginalRows VALUES
    (1,55.5,'Some cool red coat'),(2,80.0,'Some cool green coat XL'),(3,250.00,'Some cool green coat L'),(4,100.0,'Some whiskey'),(5,42.0,'This is not a match')

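INSERT INTO Rowkeyvalue VALUES
    (1,'Color','Red'),(1,'Size','XL'),(1,'Kind','Coat'),
    (2,'Color','Green'),(2,'Size','XL'),(2,'Kind','Coat'),
    (3,'Color','Green'),(3,'Size','L'),(3,'Kind','Coat'),
    (4,'Color','Green'),(4,'Size','Medium'),(4,'Kind','Whiskey')


INSERT INTO Targetkeyvalue VALUES
    (55,'Color','Red'),
    (56,'Color','Green'),(56,'Size','XL'),
    (57,'Kind','Coat'),
    (58,'Color','Green'),(58,'Size','Medium'),(58,'Kind','Whiskey')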

This gives the following tables:


-- table OriginalRows
Id  Cost    BunchOfOtherCols
1   56      Some cool red coat
2   80      Some cool green coat XL
3   250     Some cool green coat L
4   100     Some whiskey
5   42      This is not a match

-- table Rowkeyvalue
RowId   KeyPart ValuePart
1       Color   Red
1       Size    XL
1       Kind    Coat
2       Color   Green
2       Size    XL
2       Kind    Coat
3       Color   Green
3       Size    L
3       Kind    Coat
4       Color   Green
4       Size    Medium
4       Kind    Whiskey

-- table Targetkeyvalue
TargetId    TargetKey   TargetValue
55          Color       Red
56          Color       Green
56          Size        XL
57          Kind        Coat
58          Color       Green
58          Size        Medium
58          Kind        Whiskey

Expected outcome

The function further down gives the correct result:

Id  Cost    BunchOfOtherCols            IsTargetMatch   TargetKeyId
1   56      Some cool red coat          1               55
2   80      Some cool green coat XL     1               56
3   250     Some cool green coat L      1               57
4   100     Some whiskey                1               58
5   42      This is not a match         0               NULL

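In other words:

  • Bind each original row id to the first target id it matches (if it is easier, I can live with the join returning multiple matches)
  • Show the original row even when nothing matches
  • A match is true if every key/value pair belonging to one target id matches the same key/value on the given original row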

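Current approach using cursors... sigh

The code below uses cursors, but this proves to be slow (understandably so, since it is basically just non-indexed table scans over and over).

Another approach I tried was using XML PATH queries, but that turned out to be a non-starter (it was easy, but also far too slow).

I know this is a non-trivial task in relational databases, but I am hoping there is still a fairly straightforward solution. What I have below kind of works, and I might just use a batch process to store the results or something, unless there is a better way to do this using SET operations or, I don't know, a FULL JOIN?

Any solution that can be used in a view (i.e., no dynamic SQL and no calling SPs) is fine. We used to have an SP-based solution, but since the data needs to be analyzed in PowerBI and other systems, SQL views and determinism are the way to go.

Here is a fully working minimal example of what I am after. The function is the part I would like to replace with something less procedural and more functional (i.e., a set-based approach):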

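CREATE TABLE Targetkeyvalue(TargetId INT,TargetKey NVARCHAR(50),TargetValue NVARCHAR(50))
CREATE TABLE OriginalRows(Id INT,Cost DECIMAL,BunchOfOtherCols NVARCHAR(500),CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC))
CREATE TABLE Rowkeyvalue(RowId INT,KeyPart NVARCHAR(50),ValuePart NVARCHAR(50),CONSTRAINT [FK_RowId_Id] FOREIGN KEY (RowId) REFERENCES OriginalRows(Id))

INSERT INTO OriginalRows VALUES
    (1,55.5,'Some cool red coat'),(2,80.0,'Some cool green coat XL'),(3,250.00,'Some cool green coat L'),(4,100.0,'Some whiskey'),(5,42.0,'This is not a match')

INSERT INTO Rowkeyvalue VALUES
    (1,'Color','Red'),(1,'Size','XL'),(1,'Kind','Coat'),
    (2,'Color','Green'),(2,'Size','XL'),(2,'Kind','Coat'),
    (3,'Color','Green'),(3,'Size','L'),(3,'Kind','Coat'),
    (4,'Color','Green'),(4,'Size','Medium'),(4,'Kind','Whiskey')

INSERT INTO Targetkeyvalue VALUES
    (55,'Color','Red'),
    (56,'Color','Green'),(56,'Size','XL'),
    (57,'Kind','Coat'),
    (58,'Color','Green'),(58,'Size','Medium'),(58,'Kind','Whiskey')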

GO


CREATE FUNCTION dbo.MatchTargetAgainstKeysFromrow
(
    @rowid INT
)
RETURNS @MatchResults TABLE(
    IsTargetMatch BIT,TargetKeyId INT)

AS
BEGIN

    --
    -- METHOD (1) (faster,by materializing the xml field into a cross-over lookup table)
    --

    -- single row from activities as key/value pairs multi-row
    DECLARE @rowAskeyvalue AS TABLE(KeyPart NVARCHAR(1000),ValuePart NVARCHAR(MAX))
    INSERT INTO @rowAskeyvalue (KeyPart,ValuePart)
        SELECT KeyPart,ValuePart FROM Rowkeyvalue WHERE RowId = @rowid


    DECLARE @LookupColumn NVARCHAR(100)
    DECLARE @LookupValue NVARCHAR(max)
    DECLARE @TargetId INT
    DECLARE @CurrentTargetId INT
    DECLARE @IsMatch INT
    DECLARE key_Cursor CURSOR
        LOCAL STATIC FORWARD_ONLY READ_ONLY
        FOR SELECT TargetKey,TargetValue,TargetId FROM Targetkeyvalue  ORDER BY TargetId

    OPEN key_Cursor
    FETCH NEXT FROM key_Cursor INTO @LookupColumn,@LookupValue,@TargetId

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @IsMatch = (SELECT COUNT(*) FROM @rowAskeyvalue WHERE KeyPart = @LookupColumn AND ValuePart = @LookupValue)
        IF(@IsMatch = 0)
        BEGIN
            -- move to next key that isn't the current key
            SET @CurrentTargetId = @TargetId
            WHILE @@FETCH_STATUS = 0 AND @CurrentTargetId = @TargetId
            BEGIN
                FETCH NEXT FROM key_Cursor INTO @LookupColumn,@LookupValue,@TargetId
            END
        END
        ELSE
            BEGIN
                SET @CurrentTargetId = @TargetId
                WHILE @@FETCH_STATUS = 0 AND @IsMatch > 0 AND @CurrentTargetId = @TargetId
                BEGIN
                    FETCH NEXT FROM key_Cursor INTO @LookupColumn,@LookupValue,@TargetId
                    IF @CurrentTargetId = @TargetId
                        SET @IsMatch = (SELECT COUNT(*) FROM @rowAskeyvalue WHERE KeyPart = @LookupColumn AND ValuePart = @LookupValue)
                END
                IF @IsMatch > 0
                BEGIN
                    -- we found a positive matching key,nothing more to do
                    BREAK
                END
            END
    END

    DEALLOCATE key_Cursor       -- deallocating a cursor also closes it

    INSERT @MatchResults
    SELECT
        (CASE WHEN (SELECT COUNT(*) FROM @rowAskeyvalue) > 0 THEN 1 ELSE 0 END),(CASE WHEN @IsMatch > 0 THEN @CurrentTargetId ELSE NULL END)

    RETURN
END

GO

-- function in action
select * from OriginalRows
cross apply dbo.MatchTargetAgainstKeysFromrow(Id) fn

-- cleanup
drop function dbo.MatchTargetAgainstKeysFromrow
drop table Targetkeyvalue
drop table Rowkeyvalue
drop table OriginalRows

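Solution

This question is a case of Relational Division With Remainder, with multiple dividends and divisors.

Relational division is basically the opposite of a join: in this case we want to know which TargetIds match which OriginalRows, based on every key/value pair of the TargetId matching a key/value pair of the row.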
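To see why the HAVING COUNT(*) = COUNT(rKV.RowId) test used in this answer works, here is a minimal sketch for one target and one row, with the pairs inlined as VALUES lists copied from the question's sample data (target 56 against row 3). The aliases t and r and the column names TargetPairs and MatchedPairs are made up for illustration.

```sql
-- Target 56 requires (Color, Green) and (Size, XL).
-- Row 3 has (Color, Green), (Size, L) and (Kind, Coat).
SELECT
    COUNT(*)         AS TargetPairs,   -- every target pair, matched or not
    COUNT(r.KeyPart) AS MatchedPairs   -- only pairs the row also has (NULLs are not counted)
FROM (VALUES ('Color','Green'),('Size','XL')) t(TargetKey,TargetValue)
LEFT JOIN (VALUES ('Color','Green'),('Size','L'),('Kind','Coat')) r(KeyPart,ValuePart)
    ON r.KeyPart = t.TargetKey AND r.ValuePart = t.TargetValue;
-- TargetPairs = 2, MatchedPairs = 1: the counts differ, so target 56 does not match row 3.
```

The row's extra pair (Kind, Coat) is simply ignored, which is the "remainder" part: the row may have more pairs than the target asks for.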

There are a number of ways to do this; here are a few:

SELECT
    r.Id,r.Cost,r.BunchOfOtherCols,t.TargetId
FROM OriginalRows r
OUTER APPLY (
    SELECT tKV.TargetId
    FROM TargetKeyValue tKV
    LEFT JOIN RowKeyValue rKV
        ON rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
        AND rKV.RowId = r.Id
    GROUP BY tKV.TargetId
    HAVING COUNT(*) = COUNT(rKV.RowId)  -- all target k/vs have match
) t;

SELECT
    r.Id,tKV.TargetId
FROM OriginalRows r
CROSS JOIN TargetKeyValue tKV
LEFT JOIN RowKeyValue rKV
    ON rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
    AND rKV.RowId = r.Id
GROUP BY
    r.Id,tKV.TargetId
HAVING COUNT(*) = COUNT(rKV.RowId)  -- all target k/vs have match

SELECT
    r.Id,tKV.TargetId
FROM OriginalRows r
CROSS JOIN TargetKeyValue tKV
CROSS APPLY (VALUES (
    CASE WHEN EXISTS (SELECT 1
        FROM RowKeyValue rKV
        WHERE rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
          AND rKV.RowId = r.Id
    ) THEN 1 END
) ) rKV(IsMatch)
GROUP BY
    r.Id,tKV.TargetId
HAVING COUNT(*) = COUNT(rKV.IsMatch)  -- all target k/vs have match

In the LEFT JOIN versions you can also use HAVING COUNT(CASE WHEN rKV.RowId IS NULL THEN 1 END) = 0 in place of HAVING COUNT(*) = COUNT(rKV.RowId).

If you want a function for a single OriginalRows row, it is simpler still: the row id becomes a parameter, so no join to OriginalRows is needed.
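A single-row version could be sketched as an inline table-valued function along the following lines. This is an illustration under assumptions, not the original answer's code: the name dbo.MatchFirstTargetForRow is hypothetical, "first match" is taken to mean the lowest matching TargetId (via MIN), and IsTargetMatch is set from whether any target matched. The output columns mirror those of the question's cursor function.

```sql
-- Hypothetical single-row matcher; name and "lowest TargetId wins" rule are assumptions.
CREATE FUNCTION dbo.MatchFirstTargetForRow (@rowid INT)
RETURNS TABLE
AS RETURN
(
    SELECT
        -- an aggregate without GROUP BY always yields exactly one row,
        -- so a row with no matching target still gets (0, NULL)
        CAST(CASE WHEN MIN(m.TargetId) IS NULL THEN 0 ELSE 1 END AS BIT) AS IsTargetMatch,
        MIN(m.TargetId) AS TargetKeyId
    FROM (
        SELECT tKV.TargetId
        FROM TargetKeyValue tKV
        LEFT JOIN RowKeyValue rKV
            ON rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
            AND rKV.RowId = @rowid
        GROUP BY tKV.TargetId
        HAVING COUNT(*) = COUNT(rKV.RowId)  -- all target k/vs have match
    ) m
);
```

It can be called the same way as the question's function, for example select * from OriginalRows cross apply dbo.MatchFirstTargetForRow(Id) fn. With the sample data this should give TargetKeyId 55, 56, 57, 58 and NULL for rows 1 to 5. Because it is an inline function, the optimizer can expand it into the calling query, which also keeps it usable from a view.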
