Problem description
OK, here are some simplified tables that should help explain my situation.
FactListingCreated:
    ListingCreatedSk
    CreatedDateSk
    ListingSk
    StateSk
DimListing:
    ListingSk
    ListingBk
    StateCode
    ListingPrice (SCD2)
    ListingStatus (SCD2)
    RowEffectiveDate
    RowExpirationDate
    RowCurrentIndicator
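The problem I'm running into is that when I merge updates from the dimension into the base transactional fact table, the SCD2 changes on the dimension have added new rows, so I end up with duplicate entries in my fact (same ListingBk). What is the best way to handle these cases, given our key constraint that every row in the fact should point to the original Sk in the dimension table?
Current process: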
MERGE INTO dbo.FactListingCreated AS target
USING
(
    SELECT
        dlm.CreatedDateSk,
        dl.ListingSk,
        CASE
            WHEN db.brokerageSk IS NULL THEN -1
            ELSE db.brokerageSk
        END AS brokerageSk,
        ds.StateSk
    FROM stage.DimListingMerge AS dlm
    LEFT JOIN dbo.DimDate AS dd
        ON dd.DateSk = dlm.CreatedDateSk
    LEFT JOIN dbo.DimListing AS dl
        ON dl.ListingBk = dlm.ListingBk
        AND dl.RowCurrentIndicator = 1
    LEFT JOIN dbo.Dimbrokerage AS db
        ON db.brokerageBk = dlm.brokerageBk
    LEFT JOIN dbo.Dimstate AS ds
        ON ds.StateCode = dlm.StateCode
) source
ON (target.ListingSk = source.ListingSk)
WHEN MATCHED THEN
    UPDATE SET
        target.CreatedDateSk = source.CreatedDateSk,
        target.brokerageSk = source.brokerageSk,
        target.StateSk = source.StateSk
WHEN NOT MATCHED THEN
    INSERT VALUES
    (
        source.CreatedDateSk,
        source.ListingSk,
        source.StateSk
    );
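So I think this process would work for the updates (pulling only the previous day's data). But is the best approach simply to do a separate initial run (pulling all data from the dim) in which I take the initial row for each record? Or am I missing something really obvious that would make a single stored procedure possible?
Solution
When you load a fact table that references an SCD2 dimension, you need to pick, from the multiple dimension records that share the same BK, the one that was in effect on the fact's "event" date. What the event date is for your fact is determined by your business logic, so it could be the created date, the effective date, or some other date. Let's assume it is a column called EventDate...
Your SQL JOIN needs to look something like this: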
LEFT JOIN dbo.DimListing AS dl
    ON dl.ListingBk = dlm.ListingBk
    AND dlm.EventDate BETWEEN dl.RowEffectiveDate AND dl.RowExpirationDate
At the moment, your SQL simply picks up the current dimension row for every fact record, so I suspect that is the cause of your duplicates.
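To make the difference concrete, here is a minimal, self-contained sketch of the date-range lookup. It uses table variables with made-up sample rows in place of dbo.DimListing and stage.DimListingMerge, and it assumes the staging data carries an EventDate column, which is not part of the tables shown in the question.

```sql
-- Hypothetical sample data: @DimListing stands in for dbo.DimListing,
-- @Stage stands in for stage.DimListingMerge.
DECLARE @DimListing TABLE (
    ListingSk           int         NOT NULL,
    ListingBk           varchar(20) NOT NULL,
    ListingPrice        money       NOT NULL,
    RowEffectiveDate    date        NOT NULL,
    RowExpirationDate   date        NOT NULL,
    RowCurrentIndicator bit         NOT NULL
);

INSERT INTO @DimListing VALUES
    (1, 'L-100', 250000, '2020-01-01', '2020-01-31', 0),  -- original version
    (2, 'L-100', 240000, '2020-02-01', '9999-12-31', 1);  -- current version after a price change

DECLARE @Stage TABLE (
    ListingBk varchar(20) NOT NULL,
    EventDate date        NOT NULL  -- assumed column, not in the original schema
);

INSERT INTO @Stage VALUES ('L-100', '2020-01-15');

-- Pick the dimension version that was in effect on the event date.
SELECT s.ListingBk, s.EventDate, dl.ListingSk
FROM @Stage AS s
LEFT JOIN @DimListing AS dl
    ON dl.ListingBk = s.ListingBk
    AND s.EventDate BETWEEN dl.RowEffectiveDate AND dl.RowExpirationDate;
-- Returns ListingSk = 1. Filtering on RowCurrentIndicator = 1 instead
-- would return ListingSk = 2 for the same staging row.
```

Note that BETWEEN is inclusive at both ends, so this only returns a single row per fact if the date ranges for one ListingBk never overlap or share a boundary date. If your dimension sets RowExpirationDate equal to the next version's RowEffectiveDate, a half-open comparison (EventDate >= RowEffectiveDate AND EventDate < RowExpirationDate) would be the safer choice.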