Finding the time difference for each user under a condition in MySQL 5.7

Problem description

Here is my fiddle: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=7c549a3de0c8002ec43381462ba6a801

Suppose I have data like this:

CREATE TABLE test (
  ID INT,user_id INT,createdAt DATE,status_id INT
);

INSERT INTO test VALUES
  (1,12,'2020-01-01',4),(2,12,'2020-01-03',7),(3,12,'2020-01-06',7),
  (4,13,'2020-01-02',5),(5,13,'2020-01-03',6),(6,14,'2020-03-03',8),
  (7,13,'2020-03-04',4),(8,15,'2020-04-04',7),(9,14,'2020-03-02',6),
  (10,14,'2020-03-10',5),(11,13,'2020-04-10',8);
  
select * from test
order by createdAt;

This is the table returned by SELECT *:

+----+---------+------------+-----------+
| ID | user_id | createdAt  | status_id |
+----+---------+------------+-----------+
|  1 |      12 | 2020-01-01 |         4 |
|  4 |      13 | 2020-01-02 |         5 |
|  2 |      12 | 2020-01-03 |         7 |
|  5 |      13 | 2020-01-03 |         6 |
|  3 |      12 | 2020-01-06 |         7 |
|  9 |      14 | 2020-03-02 |         6 |
|  6 |      14 | 2020-03-03 |         8 |
|  7 |      13 | 2020-03-04 |         4 |
| 10 |      14 | 2020-03-10 |         5 |
|  8 |      15 | 2020-04-04 |         7 |
| 11 |      13 | 2020-04-10 |         8 |
+----+---------+------------+-----------+

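ID is the transaction ID, user_id is the ID of the user who made the transaction, createdAt is the date the transaction took place, and status_id is the transaction's status (a status_id of 7 means the transaction was rejected or not approved).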

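In this case, I want to find the time difference between approved transactions for each repeat user in the time range from '2020-02-01' to '2020-04-01'. A repeat user is a user who made a transaction before the end of the time range and made at least one more transaction within the time range. Here, that means the user made an approved transaction before '2020-04-01' and made at least one more approved transaction between '2020-02-01' and '2020-04-01'.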
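
One way to read the repeat-user rule above is as a grouped filter over approved rows. The sketch below is only an interpretation: it assumes "at least one more" means two or more approved transactions before '2020-04-01', with at least one of them inside the range, and it relies on MySQL treating a comparison as 0 or 1 so that SUM counts matching rows.

```sql
-- Sketch: users that qualify as "repeat users" under the assumed reading.
SELECT user_id
FROM test
WHERE status_id != 7                                   -- approved rows only
GROUP BY user_id
HAVING SUM(createdAt < '2020-04-01') >= 2              -- assumed threshold: two approved rows before the end
   AND SUM(createdAt BETWEEN '2020-02-01' AND '2020-04-01') >= 1;  -- at least one inside the range
```

On the sample rows this should keep users 13 and 14: user 12 has only one approved transaction, and user 15 has none. Note that BETWEEN includes both endpoints.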

Based on that description, I used this query:

SELECT SUM(transactions) AS transactions,MIN(`MIN`) AS `MIN`,MAX(`MAX`) AS `MAX`,SUM(total) / SUM(transactions) AS `AVG`
FROM (
  SELECT user_id,COUNT(*) AS transactions,MIN(diff) AS `MIN`,MAX(diff) AS `MAX`,SUM(diff) AS total
  FROM (
    SELECT user_id,DATEDIFF((SELECT MIN(t2.createdAt)
                              FROM test t2
                              WHERE t2.user_id = t1.user_id
                                AND t1.createdAt < t2.createdAt
                                AND t2.status_id in (4,5,6,8)
                              ),t1.createdAt) AS diff
    FROM test t1
    WHERE status_id in (4,5,6,8)
    HAVING SUM(status_id != 7 and createdAt < '2020-04-01') > 1
               AND SUM(status_id != 7 AND createdAt BETWEEN '2020-02-01'
               AND '2020-04-01')
  ) DiffTable
  WHERE diff IS NOT NULL
  GROUP BY user_id
) totals

It says:

In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'db_314931870.t1.user_id'; this is incompatible with sql_mode=only_full_group_by

Expected result

+-----+-----+---------+
| MIN | MAX |   AVG   |
+-----+-----+---------+
|   1 |  61 | 21.6667 |
+-----+-----+---------+

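Explanation: the minimum is a 1-day difference, for user_id 14, who made an approved transaction on '2020-03-02' and another on '2020-03-03'. The maximum is a 61-day difference, for user_id 13, who made an approved transaction on '2020-01-03' and another on '2020-03-04'. The average is the sum of all time differences in the time range divided by the number of transactions that happened in the time range.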
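
The two day counts quoted above can be checked directly with DATEDIFF, which returns the first date minus the second in whole days. This is just an arithmetic check on the dates from the sample data; the column aliases are illustrative.

```sql
-- 2020 is a leap year, so February contributes 29 days to the second gap.
SELECT DATEDIFF('2020-03-03', '2020-03-02') AS user_14_gap,  -- 1
       DATEDIFF('2020-03-04', '2020-01-03') AS user_13_gap;  -- 61
```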

Solution
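
The error in the question appears to come from the innermost SELECT, which uses aggregates in HAVING while selecting the bare column user_id with no GROUP BY. A reduced sketch of the same pattern, assuming the test table from the question and ONLY_FULL_GROUP_BY enabled (the MySQL 5.7 default):

```sql
-- Fails: HAVING SUM(...) makes this an aggregated query, but user_id is
-- neither aggregated nor grouped.
-- SELECT user_id FROM test HAVING SUM(status_id != 7) > 1;

-- Works: group by the selected column, then filter the groups.
SELECT user_id
FROM test
GROUP BY user_id
HAVING SUM(status_id != 7) > 1;
```

The query that follows applies the same idea by moving the per-user conditions into their own grouped subquery and joining it back to the transaction pairs.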

SELECT MIN(DATEDIFF(t2.createdAt,t1.createdAt)) min_diff,
       MAX(DATEDIFF(t2.createdAt,t1.createdAt)) max_diff,
       AVG(DATEDIFF(t2.createdAt,t1.createdAt)) avg_diff
FROM test t1
JOIN test t2 ON t1.user_id = t2.user_id 
            AND t1.createdAt < t2.createdAt
            AND 7 NOT IN (t1.status_id,t2.status_id)
JOIN (SELECT t3.user_id
      FROM test t3
      WHERE t3.status_id != 7
      GROUP BY t3.user_id
      HAVING SUM(t3.createdAt < '2020-04-01')
         AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id
WHERE NOT EXISTS (SELECT NULL
                  FROM test t5
                  WHERE t1.user_id = t5.user_id
                    AND t5.status_id != 7
                    AND t1.createdAt < t5.createdAt
                    AND t5.createdAt < t2.createdAt)

See the fiddle, which includes a brief explanation.
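
To inspect what the aggregate is computed over, the same joins can be run without MIN/MAX/AVG so that each pair of adjacent approved transactions is listed with its gap. This is a debugging sketch built from the query above; the aliases prev_approved, next_approved and diff_days are made up for illustration.

```sql
SELECT t1.user_id,
       t1.createdAt AS prev_approved,
       t2.createdAt AS next_approved,
       DATEDIFF(t2.createdAt, t1.createdAt) AS diff_days
FROM test t1
JOIN test t2 ON t1.user_id = t2.user_id
            AND t1.createdAt < t2.createdAt
            AND 7 NOT IN (t1.status_id, t2.status_id)   -- both rows approved
JOIN (SELECT t3.user_id                                 -- users passing the per-user filter
      FROM test t3
      WHERE t3.status_id != 7
      GROUP BY t3.user_id
      HAVING SUM(t3.createdAt < '2020-04-01')
         AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4
  ON t1.user_id = t4.user_id
WHERE NOT EXISTS (SELECT NULL                           -- no approved row strictly between the pair
                  FROM test t5
                  WHERE t1.user_id = t5.user_id
                    AND t5.status_id != 7
                    AND t1.createdAt < t5.createdAt
                    AND t5.createdAt < t2.createdAt)
ORDER BY t1.user_id, t1.createdAt;
```

The date conditions here filter users, not the pairs themselves. If only gaps that end inside the range should count, an extra condition on t2.createdAt would be needed; that is an assumption about the intended result, not something the answer states.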