Problem description
I simply need to group orders whose date ranges overlap.
Scenario A.
- order 1, 1.3.2020-30.6.2020
- order 2, 1.5.2020-31.8.2020
- order 3, 31.7.2020-31.10.2020
- order 4, 31.7.2020-31.12.2020
So the output should be
- order 1, order 2
- order 2, order 3, order 4
Order 1 is not grouped with orders 3 and 4, because their ranges do not overlap at all.
Scenario B.
Same as above, plus one more order
- order 5, 1.1.2020-31.12.2020
So the output would be
- order 1, order 2, order 5
- order 2, order 3, order 4, order 5
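Read this way, the expected output looks like every largest set of orders that share at least one common day, with single-order sets dropped. Here is a minimal Python sketch of that reading, not a definitive implementation: the function name `overlap_groups` and the dict layout are made up for illustration, and order 4 is assumed to run 31.7.2020-31.12.2020.

```python
from datetime import date

# Scenario A from the question. Order 4 is ASSUMED to run
# 31.7.2020-31.12.2020 (an assumption, see the text above).
SCENARIO_A = {
    "order 1": (date(2020, 3, 1), date(2020, 6, 30)),
    "order 2": (date(2020, 5, 1), date(2020, 8, 31)),
    "order 3": (date(2020, 7, 31), date(2020, 10, 31)),
    "order 4": (date(2020, 7, 31), date(2020, 12, 31)),
}
# Scenario B adds order 5, which spans the whole year.
SCENARIO_B = {**SCENARIO_A, "order 5": (date(2020, 1, 1), date(2020, 12, 31))}


def overlap_groups(orders):
    """Return the maximal groups (size >= 2) of orders that all share a day."""
    # A set of mutually overlapping ranges always shares the latest start
    # date among its members, so start dates are the only days to check.
    candidates = {
        frozenset(n for n, (s, e) in orders.items() if s <= day <= e)
        for day, _ in orders.values()
    }
    # Keep groups of two or more orders that are not inside a larger group.
    maximal = [
        g for g in candidates
        if len(g) > 1 and not any(g < other for other in candidates)
    ]
    return sorted(sorted(g) for g in maximal)


for scenario in (SCENARIO_A, SCENARIO_B):
    for group in overlap_groups(scenario):
        print(", ".join(group))
```

Run as is, it prints the two groups of scenario A followed by the two groups of scenario B.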
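I tried a self-join to check which start dates fall within each range. Within the range of order 1, only the start date of order 2 falls, so we have one group. Within the range of order 2 fall the start dates of orders 3 and 4, so we have a second group. But then the start date of order 4 falls within the range of order 3, and vice versa, which gives two more groups. Those are not valid, because order 2 also overlaps their date ranges and should be included as well. That would produce three duplicates of a group that should appear only once in the required output, so this approach fails.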
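To make the problem with the self-join concrete, here is a small Python sketch of the same check (which start dates fall inside each order's range). The helper `starts_within` is a hypothetical name, and order 4 is assumed to run 31.7.2020-31.12.2020.

```python
from datetime import date

# Scenario A; order 4's dates are an ASSUMPTION (31.7.2020-31.12.2020).
ORDERS = {
    "order 1": (date(2020, 3, 1), date(2020, 6, 30)),
    "order 2": (date(2020, 5, 1), date(2020, 8, 31)),
    "order 3": (date(2020, 7, 31), date(2020, 10, 31)),
    "order 4": (date(2020, 7, 31), date(2020, 12, 31)),
}


def starts_within(orders):
    """For each order, list the other orders whose start date lies in its range."""
    return {
        name: sorted(
            other
            for other, (s, _) in orders.items()
            if other != name and start <= s <= end
        )
        for name, (start, end) in orders.items()
    }


for name, others in starts_within(ORDERS).items():
    print(name, "->", others)
```

Orders 3 and 4 each list the other, but neither list contains order 2, which is how the invalid extra groups come about.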
Thanks
Solution
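You can use MATCH_RECOGNIZE to find groups in which the next value's start date is earlier than or equal to the end date of all previous values in the group. You can then aggregate and exclude any group that would be entirely contained in another group: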
WITH groups ( id, ids, start_date, end_date ) AS (
  SELECT id,
         LISTAGG( grp_id, ',' ) WITHIN GROUP ( ORDER BY start_date ),
         MIN( start_date ),
         MIN( end_date )
  FROM   (
    SELECT t.id,
           x.id AS grp_id,
           x.start_date,
           x.end_date
    FROM   table_name t
           INNER JOIN table_name x
           ON (
                 x.start_date >= t.start_date
             AND x.start_date <= t.end_date
           )
  )
  MATCH_RECOGNIZE (
    PARTITION BY id
    ORDER BY start_date
    MEASURES
      MATCH_NUMBER() AS mno
    ALL ROWS PER MATCH
    PATTERN ( FIRST_ROW GROUPED_ROWS* )
    DEFINE GROUPED_ROWS AS (
      GROUPED_ROWS.start_date <= MIN( end_date )
    )
  )
  WHERE  mno = 1
  GROUP BY id
)
SELECT id, ids
FROM   groups g
WHERE  NOT EXISTS (
  SELECT 1
  FROM   groups x
  WHERE  g.ID <> x.ID
  AND    x.start_date <= g.start_date
  AND    g.end_date <= x.end_date
)
Which, for the sample data:
CREATE TABLE table_name ( id, start_date, end_date ) AS
SELECT 'order 1',DATE '2020-03-01',DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 'order 2',DATE '2020-05-01',DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 'order 3',DATE '2020-07-31',DATE '2020-10-31' FROM DUAL UNION ALL
SELECT 'order 4',DATE '2020-07-31',DATE '2020-12-31' FROM DUAL;
Outputs:
ID      | IDS
:------ | :----------------------
order 2 | order 2,order 3,order 4
order 1 | order 1,order 2
Then, if you also run:
INSERT INTO table_name ( id, start_date, end_date )
VALUES ( 'order 5',DATE '2020-01-01',DATE '2020-12-31' );
The output is:
ID      | IDS
:------ | :----------------------
order 2 | order 2,order 4
order 5 | order 5,order 1,order 2
The result of the MATCH_RECOGNIZE solution is not correct, because order 5 should be in both groups.
I used some analytic functions to solve this:
-- create the table
Create table cross_dates (order_id number,start_date date,end_date date);
-- insert the data
insert into cross_dates values( 1,to_date('01.03.2020','dd.mm.yyyy'),to_date('30.06.2020','dd.mm.yyyy'));
insert into cross_dates values( 2,to_date('01.05.2020','dd.mm.yyyy'),to_date('31.08.2020','dd.mm.yyyy'));
insert into cross_dates values( 3,to_date('31.07.2020','dd.mm.yyyy'));
insert into cross_dates values( 4,to_date( '31.10.2020','dd.mm.yyyy'));
insert into cross_dates values( 5,to_date('01.01.2020','dd.mm.yyyy'),to_date('31.12.2020','dd.mm.yyyy'));
-- SQL
select 'Order ' || min_order_id || ': ', listagg(order_id, ',') within group (order by order_id) list
from (
  select distinct min_order_id, order_id from (
    with dates (cur_date, end_date, order_id, start_date) as (
      -- anchor member: one row per order, starting at its start date
      select start_date, end_date, order_id, start_date
      from cross_Dates
      union all
      -- recursive member: step one day at a time up to the order's end date
      select cur_date + 1, end_date, order_id, start_date
      from dates
      where cur_date < end_date )
    select d.order_id, min(d.order_id) over (partition by greatest(d.start_date, cd.start_date)) min_order_id
    from dates d, cross_Dates cd
    where d.cur_date between cd.start_date and cd.end_date ))
group by min_order_id
having count(*) > 1;
Result:
Order 1: 1,2,5
Order 2: 2,3,4,5
-- add a new column and update the old records
alter table cross_dates add (item varchar2(1));
update cross_dates set item = 'A';
-- insert new records for item B
insert into cross_dates values( 1,to_date( '30.06.2020','B');
insert into cross_dates values( 1,to_date('01.07.2020','B');
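My assumptions:
- A and B are separate orders and never belong to the same group, even when their ranges overlap
- order 1 B has two consecutive records, which I understand to behave like a single order: order 1 B 01.01.2020-21.12.2020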
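The second assumption can be sketched in Python as merging records of the same order and item when one starts no later than the day after the previous one ends. This only illustrates the assumption and is not the query itself: `merge_consecutive` and the tuple layout are hypothetical, and the item B dates are assumed from the combined range stated in the list.

```python
from datetime import date, timedelta

# Hypothetical records: (order_id, item, start_date, end_date).
# The two item B rows are ASSUMED to split 01.01.2020-21.12.2020 in two.
RECORDS = [
    (1, "B", date(2020, 1, 1), date(2020, 6, 30)),
    (1, "B", date(2020, 7, 1), date(2020, 12, 21)),
    (1, "A", date(2020, 3, 1), date(2020, 6, 30)),
]


def merge_consecutive(records):
    """Merge rows of the same order and item whose ranges touch or overlap."""
    merged = []
    # Sorting puts rows of the same (order_id, item) next to each other,
    # ordered by start date.
    for order_id, item, start, end in sorted(records):
        last = merged[-1] if merged else None
        if (
            last
            and last[0] == order_id
            and last[1] == item
            and start <= last[3] + timedelta(days=1)
        ):
            # Extend the previous range instead of adding a new row.
            merged[-1] = (order_id, item, last[2], max(last[3], end))
        else:
            merged.append((order_id, item, start, end))
    return merged


for row in merge_consecutive(RECORDS):
    print(row)
```

With these sample rows, the two item B records collapse into one range, while the item A record stays separate.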
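If my assumptions are correct, the SQL could look like this:
select distinct min_order_id, order_id, item from (
  with dates (cur_date, end_date, order_id, start_date, item) as (
    select start_date, end_date, order_id, start_date, item
    from cross_Dates
    union all
    select cur_date + 1, end_date, order_id, start_date, item
    from dates
    where cur_date < end_date )
  select d.order_id, d.item, min(d.order_id) over (partition by greatest(d.start_date, cd.start_date), d.item) min_order_id
  from dates d, cross_Dates cd
  where d.cur_date between cd.start_date and cd.end_date and d.item = cd.item )
order by item, min_order_id;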
Result:
MIN_ORDER_ID ORDER_ID I
1 1 A
1 2 A
1 5 A
2 2 A
2 3 A
2 4 A
2 5 A
5 5 A
1 1 B
If my assumptions are not correct, please provide the expected result for this case.
:)