Get records where there is more than one record in the same group

Question

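I have a list of client meetings that Python 3.8 loads into an SQLite3 database from a csv of all scheduled meetings (the csv is downloaded manually each time a new meeting is scheduled). Meetings are sometimes rescheduled, so even though each client should have only one "quarterly" meeting per quarter, the same person can end up with several scheduled meetings in one quarter. Quarters are not based on the calendar year; they vary by client. Sometimes there are "special" meetings in addition to the regular quarterly ones. Quarters are labelled Q1 to Q4, but a calendar year may start in Q2 and end in Q1, depending on how the client's year lines up with the calendar year.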

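So what I would like to do is return all duplicate client meetings per quarter, so that I can review or delete them manually and flag them as "Special" or "Other". The QTR value is calculated by Python when a record is added, based on the meeting date and the start of that client's year.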
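For illustration, the QTR calculation described above might look something like the sketch below. The function name `client_quarter` and the `year_start_month` parameter are hypothetical, and the start months used in the example calls (January for JonesTom, July for ZuckMark) are assumptions chosen only because they reproduce the QTR values in the sample data; the question does not state them.

```python
from datetime import date


def client_quarter(meeting_date: date, year_start_month: int) -> str:
    """Return 'Q1'..'Q4' for a client whose year starts in year_start_month (1-12).

    Hypothetical helper: the question does not show the real calculation.
    """
    # Months elapsed since the start of the client's year, wrapped into 0..11.
    offset = (meeting_date.month - year_start_month) % 12
    # Every block of three months is one quarter.
    return f"Q{offset // 3 + 1}"


# Assumed start months (not given in the question).
print(client_quarter(date(2020, 4, 6), 1))   # JonesTom, year assumed to start in January -> Q2
print(client_quarter(date(2020, 5, 20), 7))  # ZuckMark, year assumed to start in July -> Q4
```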

If there is another way to do this, I would love to hear it.

Schema (SQLite v3.30)

CREATE TABLE "Meetings" (
    "id_pk" INTEGER NOT NULL,
    "Hipaa" TEXT NOT NULL,
    "Meeting_Date" TEXT NOT NULL,
    "CN_Date" TEXT,
    "QTR" TEXT,
    "Date_Added" TEXT,
    "Annual" TEXT,
    "FLAG" TEXT,
    UNIQUE("Hipaa","Meeting_Date"),
    PRIMARY KEY("id_pk")
);

Query #1

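insert into Meetings ("Hipaa","Meeting_Date","QTR","FLAG")
values
('JonesTom','2020-01-03','Q1','Regular'),
('JonesTom','2020-04-06','Q2','Regular'),
('JonesTom','2020-07-10','Q3','Regular'),
('JonesTom','2020-10-15','Q4','Regular'),
('JonesTom','2021-01-10','Q1','Regular'),
('ConnSar','2020-02-04','Q1','Regular'),
('ConnSar','2020-05-07','Q2','Regular'),
('ConnSar','2020-08-11','Q3','Regular'),
('ConnSar','2020-11-02','Q4','Regular'),
('ConnSar','2020-11-16','Q4','Regular'),
('ConnSar','2021-02-12','Q1','Regular'),
('ZuckMark','2019-01-14','Q3','Regular'),
('ZuckMark','2019-01-17','Q3','Regular'),
('ZuckMark','2020-05-20','Q4','Regular'),
('ZuckMark','2020-07-05','Q1','Regular'),
('ZuckMark','2020-07-21','Q1','Regular'),
('ZuckMark','2020-10-20','Q2','Regular'),
('ZuckMark','2020-11-06','Q2','Regular'),
('ZuckMark','2020-01-02','Q3','Regular')
;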

Query #2

select * from Meetings;
| id_pk | Hipaa    | Meeting_Date | CN_Date | QTR | Date_Added | Annual | FLAG    |
| ----- | -------- | ------------ | ------- | --- | ---------- | ------ | ------- |
| 1     | JonesTom | 2020-01-03   |         | Q1  |            |        | Regular |
| 2     | JonesTom | 2020-04-06   |         | Q2  |            |        | Regular |
| 3     | JonesTom | 2020-07-10   |         | Q3  |            |        | Regular |
| 4     | JonesTom | 2020-10-15   |         | Q4  |            |        | Regular |
| 5     | JonesTom | 2021-01-10   |         | Q1  |            |        | Regular |
| 6     | ConnSar  | 2020-02-04   |         | Q1  |            |        | Regular |
| 7     | ConnSar  | 2020-05-07   |         | Q2  |            |        | Regular |
| 8     | ConnSar  | 2020-08-11   |         | Q3  |            |        | Regular |
| 9     | ConnSar  | 2020-11-02   |         | Q4  |            |        | Regular |
| 10    | ConnSar  | 2020-11-16   |         | Q4  |            |        | Regular |
| 11    | ConnSar  | 2021-02-12   |         | Q1  |            |        | Regular |
| 12    | ZuckMark | 2019-01-14   |         | Q3  |            |        | Regular |
| 13    | ZuckMark | 2019-01-17   |         | Q3  |            |        | Regular |
| 14    | ZuckMark | 2020-05-20   |         | Q4  |            |        | Regular |
| 15    | ZuckMark | 2020-07-05   |         | Q1  |            |        | Regular |
| 16    | ZuckMark | 2020-07-21   |         | Q1  |            |        | Regular |
| 17    | ZuckMark | 2020-10-20   |         | Q2  |            |        | Regular |
| 18    | ZuckMark | 2020-11-06   |         | Q2  |            |        | Regular |
| 19    | ZuckMark | 2020-01-02   |         | Q3  |            |        | Regular |

Desired result

| id_pk | Hipaa    | Meeting_Date | CN_Date | QTR | Date_Added | Annual | FLAG    |
| ----- | -------- | ------------ | ------- | --- | ---------- | ------ | ------- |
| 9     | ConnSar  | 2020-11-02   |         | Q4  |            |        | Regular |
| 10    | ConnSar  | 2020-11-16   |         | Q4  |            |        | Regular |
| 12    | ZuckMark | 2019-01-14   |         | Q3  |            |        | Regular |
| 13    | ZuckMark | 2019-01-17   |         | Q3  |            |        | Regular |
| 15    | ZuckMark | 2020-07-05   |         | Q1  |            |        | Regular |
| 16    | ZuckMark | 2020-07-21   |         | Q1  |            |        | Regular |
| 17    | ZuckMark | 2020-10-20   |         | Q2  |            |        | Regular |
| 18    | ZuckMark | 2020-11-06   |         | Q2  |            |        | Regular |

View on DB Fiddle

Solution

If I understand correctly, you want the duplicates over (hipaa, qtr). You can use exists:

select m.*
from meetings m
where exists (
    select 1 
    from meetings m1 
    where m1.hipaa = m.hipaa and m1.qtr = m.qtr and m1.id_pk <> m.id_pk
)

Another option is a window count:

select *
from (
    select m.*,count(*) over(partition by hipaa,qtr) cnt
    from meetings m
) m
where cnt > 1
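
To see what the two queries above return, here is a minimal sketch that loads the sample rows from the question into an in-memory database with Python's `sqlite3` module and runs both. The table is reduced to the columns the queries use, and the names `ROWS`, `EXISTS_SQL` and `WINDOW_SQL` are made up for this example. The window version assumes SQLite 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Sample rows from the question: (Hipaa, Meeting_Date, QTR).
ROWS = [
    ("JonesTom", "2020-01-03", "Q1"), ("JonesTom", "2020-04-06", "Q2"),
    ("JonesTom", "2020-07-10", "Q3"), ("JonesTom", "2020-10-15", "Q4"),
    ("JonesTom", "2021-01-10", "Q1"),
    ("ConnSar", "2020-02-04", "Q1"), ("ConnSar", "2020-05-07", "Q2"),
    ("ConnSar", "2020-08-11", "Q3"), ("ConnSar", "2020-11-02", "Q4"),
    ("ConnSar", "2020-11-16", "Q4"), ("ConnSar", "2021-02-12", "Q1"),
    ("ZuckMark", "2019-01-14", "Q3"), ("ZuckMark", "2019-01-17", "Q3"),
    ("ZuckMark", "2020-05-20", "Q4"), ("ZuckMark", "2020-07-05", "Q1"),
    ("ZuckMark", "2020-07-21", "Q1"), ("ZuckMark", "2020-10-20", "Q2"),
    ("ZuckMark", "2020-11-06", "Q2"), ("ZuckMark", "2020-01-02", "Q3"),
]

con = sqlite3.connect(":memory:")
# Reduced version of the question's schema; id_pk is assigned 1..19 in insert order.
con.execute(
    "CREATE TABLE Meetings ("
    " id_pk INTEGER PRIMARY KEY, Hipaa TEXT NOT NULL, Meeting_Date TEXT NOT NULL,"
    " QTR TEXT, FLAG TEXT, UNIQUE (Hipaa, Meeting_Date))"
)
con.executemany(
    "INSERT INTO Meetings (Hipaa, Meeting_Date, QTR, FLAG) VALUES (?, ?, ?, 'Regular')",
    ROWS,
)

# Duplicates over (Hipaa, QTR) via a correlated EXISTS subquery.
EXISTS_SQL = """
SELECT m.id_pk FROM Meetings m
WHERE EXISTS (
    SELECT 1 FROM Meetings m1
    WHERE m1.Hipaa = m.Hipaa AND m1.QTR = m.QTR AND m1.id_pk <> m.id_pk
)
ORDER BY m.id_pk
"""

# Same grouping via a window count (needs SQLite 3.25+).
WINDOW_SQL = """
SELECT id_pk FROM (
    SELECT id_pk, COUNT(*) OVER (PARTITION BY Hipaa, QTR) AS cnt
    FROM Meetings
) AS t
WHERE cnt > 1
ORDER BY id_pk
"""

exists_ids = [row[0] for row in con.execute(EXISTS_SQL)]
window_ids = [row[0] for row in con.execute(WINDOW_SQL)]
print(exists_ids)                # [1, 5, 6, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19]
print(exists_ids == window_ids)  # True
```

On this data both queries agree, but they also return ids 1, 5, 6, 11 and 19, which are not in the desired result: grouping on (Hipaa, QTR) alone treats the same quarter label in different years as one group.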

With EXISTS:

select m.* from Meetings m
where exists (
  select 1 from Meetings 
  where Hipaa = m.Hipaa 
  and strftime('%Y',Meeting_Date) = strftime('%Y',m.Meeting_Date)
  and QTR = m.QTR 
  and id_pk <> m.id_pk
)
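
As a quick check, the sketch below runs the year-aware query above against the question's sample rows using Python's `sqlite3` module. The table is cut down to the columns the query needs, and `SAMPLE` and `YEAR_AWARE_SQL` are names invented for this example.

```python
import sqlite3

# Sample data from the question, per client, as (Meeting_Date, QTR).
SAMPLE = {
    "JonesTom": [("2020-01-03", "Q1"), ("2020-04-06", "Q2"), ("2020-07-10", "Q3"),
                 ("2020-10-15", "Q4"), ("2021-01-10", "Q1")],
    "ConnSar": [("2020-02-04", "Q1"), ("2020-05-07", "Q2"), ("2020-08-11", "Q3"),
                ("2020-11-02", "Q4"), ("2020-11-16", "Q4"), ("2021-02-12", "Q1")],
    "ZuckMark": [("2019-01-14", "Q3"), ("2019-01-17", "Q3"), ("2020-05-20", "Q4"),
                 ("2020-07-05", "Q1"), ("2020-07-21", "Q1"), ("2020-10-20", "Q2"),
                 ("2020-11-06", "Q2"), ("2020-01-02", "Q3")],
}

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Meetings ("
    " id_pk INTEGER PRIMARY KEY, Hipaa TEXT NOT NULL,"
    " Meeting_Date TEXT NOT NULL, QTR TEXT)"
)
# Insert in the same order as the question so id_pk runs from 1 to 19.
for hipaa, meetings in SAMPLE.items():
    con.executemany(
        "INSERT INTO Meetings (Hipaa, Meeting_Date, QTR) VALUES (?, ?, ?)",
        [(hipaa, meeting_date, qtr) for meeting_date, qtr in meetings],
    )

# A row counts as a duplicate only if another row shares the client,
# the calendar year of the meeting date, and the quarter label.
YEAR_AWARE_SQL = """
SELECT m.id_pk FROM Meetings m
WHERE EXISTS (
    SELECT 1 FROM Meetings
    WHERE Hipaa = m.Hipaa
      AND strftime('%Y', Meeting_Date) = strftime('%Y', m.Meeting_Date)
      AND QTR = m.QTR
      AND id_pk <> m.id_pk
)
ORDER BY m.id_pk
"""

duplicate_ids = [row[0] for row in con.execute(YEAR_AWARE_SQL)]
print(duplicate_ids)  # [9, 10, 12, 13, 15, 16, 17, 18]
```

These are the ids listed in the desired result. One caveat: `strftime('%Y', ...)` groups by calendar year, so if a client's quarter straddles New Year, two meetings in the same client quarter could land in different calendar years and be missed. Storing a client-year value next to QTR would be one way around that, though no such column exists in the schema shown.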

See the demo.
