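Problem description
TLDR
How do I match a set of sets against a single set and bind the result to the corresponding row?
Given a row with a linked table of key/value pairs describing that row's properties, and a set of search descriptions (targets) describing how the row should be summarized, how do I find which search description matches a given row, based on matching the row's key/value pairs against the key/value pairs of each search description?
Simplified example: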
CREATE TABLE Targetkeyvalue(TargetId INT,TargetKey NVARCHAR(50),TargetValue NVARCHAR(50))
CREATE TABLE OriginalRows(Id INT,Cost DECIMAL,BunchOfOtherCols NVARCHAR(500),CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC))
CREATE TABLE Rowkeyvalue(RowId INT,KeyPart NVARCHAR(50),ValuePart NVARCHAR(50),CONSTRAINT [FK_RowId_Id] FOREIGN KEY (RowId) REFERENCES OriginalRows(Id))
INSERT INTO OriginalRows VALUES
(1,55.5,'Some cool red coat'),(2,80.0,'Some cool green coat XL'),(3,250.00,'Some cool green coat L'),(4,100.0,'Some whiskey'),(5,42.0,'This is not a match')
INSERT INTO Rowkeyvalue VALUES
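(1,'Color','Red'),(1,'Size','XL'),(1,'Kind','Coat'),(2,'Color','Green'),(2,'Size','XL'),(2,'Kind','Coat'),(3,'Color','Green'),(3,'Size','L'),(3,'Kind','Coat'),(4,'Color','Green'),(4,'Size','Medium'),(4,'Kind','Whiskey')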
INSERT INTO Targetkeyvalue VALUES
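(55,'Color','Red'),(56,'Color','Green'),(56,'Size','XL'),(57,'Kind','Coat'),(58,'Color','Green'),(58,'Size','Medium'),(58,'Kind','Whiskey')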
This gives the following tables:
-- table OriginalRows
Id Cost BunchOfOtherCols
1 56 Some cool red coat
2 80 Some cool green coat XL
3 250 Some cool green coat L
4 100 Some whiskey
5 42 This is not a match
-- table Rowkeyvalue
RowId KeyPart ValuePart
1 Color Red
1 Size XL
1 Kind Coat
2 Color Green
2 Size XL
2 Kind Coat
3 Color Green
3 Size L
3 Kind Coat
4 Color Green
4 Size Medium
4 Kind Whiskey
-- table Targetkeyvalue
TargetId TargetKey TargetValue
55 Color Red
56 Color Green
56 Size XL
57 Kind Coat
58 Color Green
58 Size Medium
58 Kind Whiskey
Expected result
The function below gives the correct result:
Id Cost BunchOfOtherCols IsTargetMatch TargetKeyId
1 56 Some cool red coat 1 55
2 80 Some cool green coat XL 1 56
3 250 Some cool green coat L 1 57
4 100 Some whiskey 1 58
5 42 This is not a match 0 NULL
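In other words:
Current approach using a cursor... ugh
The code below uses a cursor, but it proves to be slow (understandably, since it is basically one unindexed table scan after another).
Another approach I tried was an XML PATH query, but that turned out to be a non-starter (it was easy to write, but also far too slow).
I know this is a non-trivial task in a relational database, but I am hoping there is still a reasonably simple solution. What I have below sort of works, and I might just use batching to store the results or something, unless there is a better way using SET operations or, I dunno, FULL JOIN?
Any solution that can be used in a view (i.e., one that does not involve dynamic SQL or calling an SP) is fine. We used to have an SP-based solution, but since the data needs to be analyzed in Power BI and other systems, SQL views and determinism are the way to go.
Here is a fully working minimal example of what I am after. The function is the part I would like to replace with something less procedural and more functional (i.e., a set-based approach):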
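CREATE TABLE Targetkeyvalue(TargetId INT,TargetKey NVARCHAR(50),TargetValue NVARCHAR(50))
CREATE TABLE OriginalRows(Id INT,Cost DECIMAL,BunchOfOtherCols NVARCHAR(500),CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC))
CREATE TABLE Rowkeyvalue(RowId INT,KeyPart NVARCHAR(50),ValuePart NVARCHAR(50),CONSTRAINT [FK_RowId_Id] FOREIGN KEY (RowId) REFERENCES OriginalRows(Id))
INSERT INTO OriginalRows VALUES
(1,55.5,'Some cool red coat'),(2,80.0,'Some cool green coat XL'),(3,250.00,'Some cool green coat L'),(4,100.0,'Some whiskey'),(5,42.0,'This is not a match')
INSERT INTO Rowkeyvalue VALUES
(1,'Color','Red'),(1,'Size','XL'),(1,'Kind','Coat'),(2,'Color','Green'),(2,'Size','XL'),(2,'Kind','Coat'),(3,'Color','Green'),(3,'Size','L'),(3,'Kind','Coat'),(4,'Color','Green'),(4,'Size','Medium'),(4,'Kind','Whiskey')
INSERT INTO Targetkeyvalue VALUES
(55,'Color','Red'),(56,'Color','Green'),(56,'Size','XL'),(57,'Kind','Coat'),(58,'Color','Green'),(58,'Size','Medium'),(58,'Kind','Whiskey')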
GO
CREATE FUNCTION dbo.MatchTargetAgainstKeysFromrow
(
@rowid INT
)
RETURNS @MatchResults TABLE(
IsTargetMatch BIT,TargetKeyId INT)
AS
BEGIN
--
-- METHOD (1) (faster,by materializing the xml field into a cross-over lookup table)
--
-- single row from activities as key/value pairs multi-row
DECLARE @rowAskeyvalue AS TABLE(KeyPart NVARCHAR(1000),ValuePart NVARCHAR(MAX))
INSERT INTO @rowAskeyvalue (KeyPart,ValuePart)
SELECT KeyPart,ValuePart FROM Rowkeyvalue WHERE RowId = @rowid
DECLARE @LookupColumn NVARCHAR(100)
DECLARE @LookupValue NVARCHAR(max)
DECLARE @TargetId INT
DECLARE @CurrentTargetId INT
DECLARE @IsMatch INT
DECLARE key_Cursor CURSOR
LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR SELECT TargetKey,TargetValue,TargetId FROM Targetkeyvalue ORDER BY TargetId
OPEN key_Cursor
FETCH NEXT FROM key_Cursor INTO @LookupColumn,@LookupValue,@TargetId
WHILE @@FETCH_STATUS = 0
BEGIN
SET @IsMatch = (SELECT COUNT(*) FROM @rowAskeyvalue WHERE KeyPart = @LookupColumn AND ValuePart = @LookupValue)
IF(@IsMatch = 0)
BEGIN
-- move to next key that isn't the current key
SET @CurrentTargetId = @TargetId
WHILE @@FETCH_STATUS = 0 AND @CurrentTargetId = @TargetId
BEGIN
FETCH NEXT FROM key_Cursor INTO @LookupColumn,@LookupValue,@TargetId
END
END
ELSE
BEGIN
SET @CurrentTargetId = @TargetId
WHILE @@FETCH_STATUS = 0 AND @IsMatch > 0 AND @CurrentTargetId = @TargetId
BEGIN
FETCH NEXT FROM key_Cursor INTO @LookupColumn,@LookupValue,@TargetId
IF @CurrentTargetId = @TargetId
SET @IsMatch = (SELECT COUNT(*) FROM @rowAskeyvalue WHERE KeyPart = @LookupColumn AND ValuePart = @LookupValue)
END
IF @IsMatch > 0
BEGIN
-- we found a positive matching key,nothing more to do
BREAK
END
END
END
DEALLOCATE key_Cursor -- deallocating a cursor also closes it
INSERT @MatchResults
SELECT
(CASE WHEN (SELECT COUNT(*) FROM @rowAskeyvalue) > 0 THEN 1 ELSE 0 END),(CASE WHEN @IsMatch > 0 THEN @CurrentTargetId ELSE NULL END)
RETURN
END
GO
-- function in action
select * from OriginalRows
cross apply dbo.MatchTargetAgainstKeysFromrow(Id) fn
-- cleanup
drop function dbo.MatchTargetAgainstKeysFromrow
drop table Targetkeyvalue
drop table Rowkeyvalue
drop table OriginalRows
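Solution
This problem is a case of Relational Division With Remainder, with multiple dividends and divisors.
Relational division is basically the opposite of a join: in this case, we want to know which OriginalRows match which TargetIds, based on every key/value pair of the TargetId being matched by a key/value pair of the OriginalRows row.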
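To make the idea concrete, here is a minimal sketch for one row and one target, using the sample data from the question. RowId 2 and TargetId 56 are picked purely for illustration. The row satisfies the target when no target key/value pair is left over after removing the row's pairs:

```sql
-- Sketch only: checks a single (row, target) combination.
-- The literals 2 and 56 are illustrative values from the sample data.
SELECT CASE WHEN NOT EXISTS (
           -- every key/value pair the target requires ...
           SELECT TargetKey, TargetValue
           FROM Targetkeyvalue
           WHERE TargetId = 56
           EXCEPT
           -- ... minus the pairs the row actually has
           SELECT KeyPart, ValuePart
           FROM Rowkeyvalue
           WHERE RowId = 2
       ) THEN 1 ELSE 0 END AS IsTargetMatch;
```

With the sample data this should return 1, because row 2 has both Color = Green and Size = XL. Extra pairs on the row (here Kind = Coat) are the "remainder" and do not prevent a match.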
There are many ways to do this; here are a few. The first starts from OriginalRows and looks up the matching targets for each row:
SELECT
r.Id,r.Cost,r.BunchOfOtherCols,t.TargetId
FROM OriginalRows r
OUTER APPLY (
SELECT tKV.TargetId
FROM TargetKeyValue tKV
LEFT JOIN RowKeyValue rKV
ON rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
AND rKV.RowId = r.Id
GROUP BY tKV.TargetId
HAVING COUNT(*) = COUNT(rKV.RowId) -- all target k/vs have match
) t;
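A row can satisfy more than one target (in the sample data, row 1 satisfies both 55 and 57), so a query of this shape can return several lines per Id. The following is a sketch of one way to get back to the question's expected output, under two assumptions that are not stated in the original: the lowest matching TargetId should win, as it does in the cursor version, and IsTargetMatch simply means "some target matched".

```sql
-- Sketch under the assumptions named above; column aliases follow the
-- question's expected result, not anything defined by the answer.
SELECT
    r.Id, r.Cost, r.BunchOfOtherCols,
    CAST(CASE WHEN t.TargetId IS NULL THEN 0 ELSE 1 END AS BIT) AS IsTargetMatch,
    t.TargetId AS TargetKeyId
FROM OriginalRows r
OUTER APPLY (
    SELECT TOP (1) tKV.TargetId          -- keep a single target per row
    FROM Targetkeyvalue tKV
    LEFT JOIN Rowkeyvalue rKV
        ON rKV.KeyPart = tKV.TargetKey
       AND rKV.ValuePart = tKV.TargetValue
       AND rKV.RowId = r.Id
    GROUP BY tKV.TargetId
    HAVING COUNT(*) = COUNT(rKV.RowId)   -- all target k/vs have a match
    ORDER BY tKV.TargetId                -- assumed tie-break: lowest TargetId
) t;
```

Because it is a single SELECT with no procedural logic, a query like this can be placed in a view, which is the constraint the question sets.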
SELECT
r.Id,tKV.TargetId
FROM OriginalRows r
CROSS JOIN TargetKeyValue tKV
LEFT JOIN RowKeyValue rKV
ON rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
AND rKV.RowId = r.Id
GROUP BY
r.Id,tKV.TargetId
HAVING COUNT(*) = COUNT(rKV.RowId) -- all target k/vs have match
You can also use HAVING COUNT(CASE WHEN rKV.RowId IS NULL THEN 1 END) = 0 in place of HAVING COUNT(*) = COUNT(rKV.RowId).
Another option checks each target key/value pair with EXISTS:
SELECT
r.Id,tKV.TargetId
FROM OriginalRows r
CROSS JOIN TargetKeyValue tKV
CROSS APPLY (VALUES (CASE WHEN EXISTS (SELECT 1
FROM RowKeyValue rKV
WHERE rKV.KeyPart = tKV.TargetKey AND rKV.ValuePart = tKV.TargetValue
AND rKV.RowId = r.Id
) THEN 1 END
) ) rKV(IsMatch)
GROUP BY
r.Id,tKV.TargetId
HAVING COUNT(*) = COUNT(rKV.IsMatch) -- all target k/vs have match
If you want a function for a single OriginalRows row, that is even simpler:
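A minimal sketch of what such a function might look like, reusing the LEFT JOIN / HAVING pattern from the queries above. The function name dbo.GetMatchingTargets is hypothetical, and the choice of an inline table-valued function is an assumption:

```sql
-- Sketch only: dbo.GetMatchingTargets is a hypothetical name.
CREATE FUNCTION dbo.GetMatchingTargets (@rowid INT)
RETURNS TABLE
AS RETURN
    SELECT tKV.TargetId
    FROM Targetkeyvalue tKV
    LEFT JOIN Rowkeyvalue rKV
        ON rKV.KeyPart = tKV.TargetKey
       AND rKV.ValuePart = tKV.TargetValue
       AND rKV.RowId = @rowid
    GROUP BY tKV.TargetId
    HAVING COUNT(*) = COUNT(rKV.RowId);  -- all target k/vs have a match
GO
-- Usage: OUTER APPLY keeps rows that match no target (TargetId is NULL)
SELECT r.Id, r.Cost, r.BunchOfOtherCols, fn.TargetId
FROM OriginalRows r
OUTER APPLY dbo.GetMatchingTargets(r.Id) fn;
```

An inline function of this kind is expanded into the calling query by the optimizer, so it generally avoids the row-by-row cost of the multi-statement cursor function in the question.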