Problem description
Here is my fiddle: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=7c549a3de0c8002ec43381462ba6a801
Suppose I have data like this:
CREATE TABLE test (
ID INT,user_id INT,createdAt DATE,status_id INT
);
INSERT INTO test VALUES
(1,12,'2020-01-01',4),
(2,12,'2020-01-03',7),
(3,12,'2020-01-06',7),
(4,13,'2020-01-02',5),
(5,13,'2020-01-03',6),
(6,14,'2020-03-03',8),
(7,13,'2020-03-04',4),
(8,15,'2020-04-04',7),
(9,14,'2020-03-02',6),
(10,14,'2020-03-10',5),
(11,13,'2020-04-10',8);
select * from test
order by createdAt;
This is the table after SELECT *:
+----+---------+------------+-----------+
| ID | user_id | createdAt | status_id |
+----+---------+------------+-----------+
| 1 | 12 | 2020-01-01 | 4 |
| 4 | 13 | 2020-01-02 | 5 |
| 2 | 12 | 2020-01-03 | 7 |
| 5 | 13 | 2020-01-03 | 6 |
| 3 | 12 | 2020-01-06 | 7 |
| 9 | 14 | 2020-03-02 | 6 |
| 6 | 14 | 2020-03-03 | 8 |
| 7 | 13 | 2020-03-04 | 4 |
| 10 | 14 | 2020-03-10 | 5 |
| 8 | 15 | 2020-04-04 | 7 |
| 11 | 13 | 2020-04-10 | 8 |
+----+---------+------------+-----------+
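ID is the transaction's ID, user_id is the ID of the user who made the transaction, createdAt is the date the transaction took place, and status_id is the transaction's status (a status_id of 7 means the transaction was rejected or not approved).
In this case, I want to find the time difference between approved transactions for each repeat user within the time range '2020-02-01' to '2020-04-01'. A repeat user is a user who made a transaction before the end of the time range and made at least one more transaction within the range. Here that means the user made an approved transaction before '2020-04-01' and made at least one more approved transaction between '2020-02-01' and '2020-04-01'.
Based on that description, I used this query: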
SELECT SUM(transactions) AS transactions,MIN(`MIN`) AS `MIN`,MAX(`MAX`) AS `MAX`,SUM(total) / SUM(transactions) AS `AVG`
FROM (
SELECT user_id,COUNT(*) AS transactions,MIN(diff) AS `MIN`,MAX(diff) AS `MAX`,SUM(diff) AS total
FROM (
SELECT user_id,DATEDIFF((SELECT MIN(t2.createdAt)
FROM test t2
WHERE t2.user_id = t1.user_id
AND t1.createdAt < t2.createdAt
AND t2.status_id in (4,5,6,8)
),t1.createdAt) AS diff
FROM test t1
WHERE status_id in (4,8)
HAVING SUM(status_id != 7 and createdAt < '2020-04-01') > 1
AND SUM(status_id != 7 AND createdAt BETWEEN '2020-02-01'
AND '2020-04-01')
) DiffTable
WHERE diff IS NOT NULL
GROUP BY user_id
) totals
It says:
In aggregated query without GROUP BY,expression #1 of SELECT list contains nonaggregated column 'db_314931870.t1.user_id'; this is incompatible with sql_mode=only_full_group_by
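For context, a likely cause of this error is the HAVING clause in the innermost query: it uses SUM() but that query has no GROUP BY, so MySQL treats the whole result as a single group and cannot pick one user_id for it. The sketch below is a minimal reproduction against the test table above, followed by a grouped version of the same filter. The thresholds are copied from the query in the question, and the explicit "> 0" is an assumption about what the second condition was meant to check.

```sql
-- Rejected under ONLY_FULL_GROUP_BY: HAVING with an aggregate and no GROUP BY
-- makes this an aggregated query, so the bare user_id column is ambiguous.
-- SELECT user_id FROM test HAVING SUM(status_id != 7) > 1;

-- Grouping by user_id evaluates each SUM() once per user instead.
SELECT user_id
FROM test
GROUP BY user_id
HAVING SUM(status_id != 7 AND createdAt < '2020-04-01') > 1
   AND SUM(status_id != 7
           AND createdAt BETWEEN '2020-02-01' AND '2020-04-01') > 0;
```

A grouped subquery like this can then be joined back to the transaction rows, which keeps the per-user filter separate from the per-row date differences.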
Expected result
+-----+-----+---------+
| MIN | MAX | AVG |
+-----+-----+---------+
| 1 | 61 | 21,6667 |
+-----+-----+---------+
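Explanation: the minimum is a 1-day difference, which comes from user_id 14, who made an approved transaction on '2020-03-02' and another approved transaction on '2020-03-03'. The maximum is a 61-day difference, which comes from user_id 13, who made an approved transaction on '2020-01-03' and another approved transaction on '2020-03-04'. The average is the sum of all time differences in the time range divided by the number of transactions in the time range.
Solution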
SELECT MIN(DATEDIFF(t2.createdAt, t1.createdAt)) min_diff,
       MAX(DATEDIFF(t2.createdAt, t1.createdAt)) max_diff,
       AVG(DATEDIFF(t2.createdAt, t1.createdAt)) avg_diff
FROM test t1
-- pair each approved transaction with every later approved one by the same user
JOIN test t2 ON t1.user_id = t2.user_id
            AND t1.createdAt < t2.createdAt
            AND 7 NOT IN (t1.status_id, t2.status_id)
-- t4: users with an approved transaction before '2020-04-01'
-- and at least one approved transaction inside the range
JOIN (SELECT t3.user_id
      FROM test t3
      WHERE t3.status_id != 7
      GROUP BY t3.user_id
      HAVING SUM(t3.createdAt < '2020-04-01')
         AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id
-- keep only adjacent pairs: no approved transaction lies strictly between them
WHERE NOT EXISTS (SELECT NULL
                  FROM test t5
                  WHERE t1.user_id = t5.user_id
                    AND t5.status_id != 7
                    AND t1.createdAt < t5.createdAt
                    AND t5.createdAt < t2.createdAt)
fiddle, with a brief explanation.
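To see which pairs feed the aggregates, the same self-join can be run without MIN/MAX/AVG. This is only a sketch against the question's test table: the aliases prev_approved, next_approved and diff are illustrative names that do not appear in the answer, and the repeat-user join (t4) is left out to keep the example short.

```sql
-- One row per pair of consecutive approved transactions for the same user.
SELECT t1.user_id,
       t1.createdAt AS prev_approved,
       t2.createdAt AS next_approved,
       DATEDIFF(t2.createdAt, t1.createdAt) AS diff
FROM test t1
JOIN test t2 ON t1.user_id = t2.user_id
            AND t1.createdAt < t2.createdAt
            AND 7 NOT IN (t1.status_id, t2.status_id)
-- Drop pairs that have another approved transaction strictly between them.
WHERE NOT EXISTS (SELECT NULL
                  FROM test t5
                  WHERE t1.user_id = t5.user_id
                    AND t5.status_id != 7
                    AND t1.createdAt < t5.createdAt
                    AND t5.createdAt < t2.createdAt)
ORDER BY t1.user_id, t1.createdAt;
```

On the sample data this should include the two pairs described in the question: user 14 from 2020-03-02 to 2020-03-03 (diff 1) and user 13 from 2020-01-03 to 2020-03-04 (diff 61).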