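Problem description
In SQL (Netezza), I have a set of ~25,000 cases, and I want to match each case to 3 controls drawn from a pool of ~82,000,000 potential controls. The controls must match on several categorical variables and fall within a range on several numeric variables (e.g. +/- 365 days). I do this in SQL with a left join and a window function that assigns a random number to every potential control for each case and then picks 3 controls per case.
So far this works well, except that a small number of cases (~130) end up receiving the same controls as other cases. I have tried loosening the matching criteria, but I have hit the limits of this approach, and a few cases still do not receive 3 controls that are distinct from the controls assigned to other cases. Essentially I need matching without replacement, but I cannot figure out how to give each control a sufficiently unique identifier so that I can ensure it is not reused once it has been assigned to a case.
Below is some sample code with a very small toy dataset. In this example, Sam and Pam have multiple matches. I want the matches to be: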
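To see which controls were handed to more than one case, a quick check along the following lines may help. It is only a sketch: it assumes the matched result is stored in a table shaped like the `pets` table built in the toy example below (one row per case/control pair, with the control identified by `Animal_Name`), and the alias `times_assigned` is made up for illustration.

```sql
-- List every control that was assigned to more than one case.
-- Assumes a result table like `pets` with one row per case/control pair.
select Animal_Name,
       count(*) as times_assigned   -- hypothetical alias
from pets
group by Animal_Name
having count(*) > 1
order by times_assigned desc;
```

An empty result would mean that no control was reused, which is the property that matching without replacement is supposed to guarantee.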
--------------
|Pam | Bird |
|Pam | Snake |
|Sam | Goat |
|Sam | Pig |
--------------
rather than:
--------------
|Pam | Bird |
|Pam | Snake |
|Sam | Snake |
|Sam | Pig |
--------------
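Although the example is copied exactly from my database program, it does not work in Rextester. I could not find any guide for Rextester, so I do not know whether I have specified Netezza incorrectly there (or whether that is even possible). Rextester reports a syntax problem near the partition function, but apart from the problem I am trying to solve, the code runs fine in my database program. I use the random row-number assignment to limit the number of matches per case (2 controls per case in the example below, 3 controls per case in my real data).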
#MySQL 5.7.12
#please drop objects you've created at the end of the script
#or check for their existence before creating
#'\\' is a delimiter
select version() as 'netezza';
create table owners (Name NVARCHAR(4),Energy INT,Walking INT,Swimming INT,Running INT);
insert into owners (Name,Energy,Walking,Swimming,Running)
values
('Tom',4,1,0),('Tim',7,1),('Pam',0)
('Sam',0);
create table animals (Name NVARCHAR(10),Energy INT,Walking INT,Swimming INT,Running INT);
insert into animals (Name,Energy,Walking,Swimming,Running)
values
('Cat',5,('Horse',('Rat',3,('Pit',('Pug',('Fish',('Bird',2,('Snake',('Goat',0)
('Pig',0);
create table pets as
select Owner_Name,Animal_Name,
       Owner_Energy,Animal_Energy,
       Owner_Walking,Animal_Walking,
       Owner_Swimming,Animal_Swimming,
       Owner_Running,Animal_Running
from
(select a.Name as Owner_Name,b.Name as Animal_Name,
        a.Energy as Owner_Energy,b.Energy as Animal_Energy,
        a.Walking as Owner_Walking,b.Walking as Animal_Walking,
        a.Swimming as Owner_Swimming,b.Swimming as Animal_Swimming,
        a.Running as Owner_Running,b.Running as Animal_Running,
        row_number() over(partition by Owner_Name order by random()) as random_number
from owners as a
left join animals as b
on a.walking = b.walking and a.swimming = b.swimming and a.running = b.running
where a.energy between b.energy - 2 and b.energy + 2) as foo
where random_number < 3;
select * from pets;
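One possible way to approximate matching without replacement in a single statement is to let each control choose one case first, and only then pick controls per case. The sketch below is a heuristic, not a guaranteed or optimal matching. It assumes that `animals` carries the `Energy`, `Walking` and `Swimming` columns that the query above reads, that names identify cases and controls uniquely, and that the table name `pets_unique` and the aliases `n_candidates`, `pick_in_control` and `pick_in_case` are free to use (all of them are made up here).

```sql
-- Heuristic sketch: each control is given to at most one case.
create table pets_unique as
select Owner_Name, Animal_Name
from
  (select Owner_Name, Animal_Name,
          -- Step 3: shuffle the controls that each case still owns.
          row_number() over (partition by Owner_Name
                             order by random()) as pick_in_case
   from
     (select Owner_Name, Animal_Name,
             -- Step 2: hand each control to exactly one eligible case,
             -- preferring the case with the fewest candidates.
             row_number() over (partition by Animal_Name
                                order by n_candidates, random()) as pick_in_control
      from
        (select a.Name as Owner_Name, b.Name as Animal_Name,
                -- Step 1: count the eligible controls of each case.
                count(*) over (partition by a.Name) as n_candidates
         from owners as a
         inner join animals as b
           on a.walking = b.walking
          and a.swimming = b.swimming
          and a.running = b.running
         where a.energy between b.energy - 2 and b.energy + 2) as candidates
     ) as by_control
   where pick_in_control = 1) as by_case
where pick_in_case < 3;  -- 2 controls per case, as in the toy example
```

Because step 2 keeps only one row per control, no control can appear under two cases. When a control is eligible for two cases, as in the Pam and Sam situation, it goes to the case with fewer candidates, which tends to protect cases whose pool is small. The trade-off is that a case can still end up with fewer controls than requested; one option in that situation is to rerun the same statement on the unfilled cases against the controls that have not been used yet.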