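Problem description
Write a query to identify returning active users. A returning active user is a user who made a second purchase within 7 days of their first purchase.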
id u_id item created_at revenue
1 109 milk 3/3/2020 0:00 123
2 139 biscuit 3/18/2020 0:00 421
3 120 milk 3/18/2020 0:00 176
4 108 banana 3/18/2020 0:00 862
5 130 milk 3/28/2020 0:00 333
6 103 bread 3/29/2020 0:00 862
7 122 banana 3/7/2020 0:00 952
8 125 bread 3/13/2020 0:00 317
9 139 bread 3/23/2020 0:00 929
10 141 banana 3/17/2020 0:00 812
11 116 bread 3/31/2020 0:00 226
12 128 bread 3/4/2020 0:00 112
13 146 biscuit 3/4/2020 0:00 362
14 119 banana 3/28/2020 0:00 127
Solution
You can use a window function to get each user's earliest created_at date, then look for other records within a week of it:
select distinct u_id
from (select t.*, min(created_at) over (partition by u_id) as min_created_at
      from t
     ) t
where created_at > min_created_at and
      created_at < min_created_at + interval 7 day;
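To check the window-function query against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. This is an assumption-laden stand-in, not the MySQL setup from the question: dates are stored as ISO strings, `date(x, '+7 days')` replaces `interval 7 day`, the function name `returning_users_first_purchase` is made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question: (id, u_id, item, created_at, revenue).
# Dates are rewritten as ISO strings so SQLite can compare them as text.
ROWS = [
    (1, 109, "milk", "2020-03-03", 123),
    (2, 139, "biscuit", "2020-03-18", 421),
    (3, 120, "milk", "2020-03-18", 176),
    (4, 108, "banana", "2020-03-18", 862),
    (5, 130, "milk", "2020-03-28", 333),
    (6, 103, "bread", "2020-03-29", 862),
    (7, 122, "banana", "2020-03-07", 952),
    (8, 125, "bread", "2020-03-13", 317),
    (9, 139, "bread", "2020-03-23", 929),
    (10, 141, "banana", "2020-03-17", 812),
    (11, 116, "bread", "2020-03-31", 226),
    (12, 128, "bread", "2020-03-04", 112),
    (13, 146, "biscuit", "2020-03-04", 362),
    (14, 119, "banana", "2020-03-28", 127),
]

QUERY = """
    SELECT DISTINCT u_id
    FROM (
        SELECT t.*, MIN(created_at) OVER (PARTITION BY u_id) AS min_created_at
        FROM t
    ) AS s
    WHERE created_at > min_created_at
      AND created_at < date(min_created_at, '+7 days')
    ORDER BY u_id
"""


def returning_users_first_purchase(rows):
    """Return u_ids with a later purchase less than 7 days after the first one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER, u_id INTEGER, item TEXT, "
        "created_at TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


print(returning_users_first_purchase(ROWS))  # [139]
```

Only user 139 appears twice in the sample (3/18 and 3/23, five days apart), so that is the single u_id returned.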
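If you only check the customer's first purchase and a second visit within the following 7 days, you will miss a third purchase that comes shortly after a later second visit. Instead, check globally for any two purchases by the same user within a 7-day interval, like this: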
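create table t (id integer, u_id integer, item varchar(100), created_at date, revenue float);
insert into t values
  (1, 109, 'milk', STR_TO_DATE('3/3/2020', '%m/%d/%Y'), 123),
  (2, 139, 'biscuit', STR_TO_DATE('3/18/2020', '%m/%d/%Y'), 421),
  (3, 120, 'milk', STR_TO_DATE('3/18/2020', '%m/%d/%Y'), 176),
  (4, 108, 'banana', STR_TO_DATE('3/18/2020', '%m/%d/%Y'), 862),
  (5, 130, 'milk', STR_TO_DATE('3/28/2020', '%m/%d/%Y'), 333),
  (6, 103, 'bread', STR_TO_DATE('3/29/2020', '%m/%d/%Y'), 862),
  (7, 122, 'banana', STR_TO_DATE('3/7/2020', '%m/%d/%Y'), 952),
  (8, 125, 'bread', STR_TO_DATE('3/13/2020', '%m/%d/%Y'), 317),
  (9, 139, 'bread', STR_TO_DATE('3/23/2020', '%m/%d/%Y'), 929),
  (10, 141, 'banana', STR_TO_DATE('3/17/2020', '%m/%d/%Y'), 812),
  (11, 116, 'bread', STR_TO_DATE('3/31/2020', '%m/%d/%Y'), 226),
  (12, 128, 'bread', STR_TO_DATE('3/4/2020', '%m/%d/%Y'), 112),
  (13, 146, 'biscuit', STR_TO_DATE('3/4/2020', '%m/%d/%Y'), 362),
  (14, 119, 'banana', STR_TO_DATE('3/28/2020', '%m/%d/%Y'), 127);
-- DATEDIFF returns the gap in days. Subtracting DATE values directly compares
-- their numeric YYYYMMDD forms, which is not a day count across month boundaries.
select * from t as t1
where exists (
  select * from t as t2
  where t1.u_id = t2.u_id
    and datediff(t1.created_at, t2.created_at) > 0
    and datediff(t1.created_at, t2.created_at) <= 7
);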
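The difference between the first-purchase check and the any-two-purchases check can be illustrated with a small sketch, again assuming an in-memory SQLite database driven from Python rather than MySQL. User 200 and its dates are invented for illustration, `julianday()` stands in for MySQL's `datediff`, both queries use the same bounds (more than 0 and at most 7 days) so only the reference purchase differs, and the pairwise query returns distinct u_id values instead of full rows.

```python
import sqlite3

# (u_id, created_at). User 139 comes from the sample data;
# user 200 is hypothetical: first purchase, then two close purchases later on.
PURCHASES = [
    (139, "2020-03-18"),
    (139, "2020-03-23"),
    (200, "2020-03-01"),
    (200, "2020-03-10"),
    (200, "2020-03-12"),
]

# Compare every purchase only against the user's first purchase.
FIRST_PURCHASE_ONLY = """
    SELECT DISTINCT t1.u_id
    FROM t AS t1
    JOIN (SELECT u_id, MIN(created_at) AS first_at FROM t GROUP BY u_id) AS f
      ON f.u_id = t1.u_id
    WHERE julianday(t1.created_at) - julianday(f.first_at) > 0
      AND julianday(t1.created_at) - julianday(f.first_at) <= 7
    ORDER BY t1.u_id
"""

# Compare every purchase against any earlier purchase by the same user.
ANY_PAIR = """
    SELECT DISTINCT t1.u_id
    FROM t AS t1
    WHERE EXISTS (
        SELECT 1
        FROM t AS t2
        WHERE t2.u_id = t1.u_id
          AND julianday(t1.created_at) - julianday(t2.created_at) > 0
          AND julianday(t1.created_at) - julianday(t2.created_at) <= 7
    )
    ORDER BY t1.u_id
"""


def run(query, rows):
    """Load rows into a fresh in-memory table and return the matching u_ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (u_id INTEGER, created_at TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


print(run(FIRST_PURCHASE_ONLY, PURCHASES))  # [139]
print(run(ANY_PAIR, PURCHASES))             # [139, 200]
```

User 200's purchases on 3/10 and 3/12 are two days apart but nine or more days after the first purchase, so only the pairwise check reports that user. Which behavior is correct depends on how strictly "within 7 days of the first purchase" is meant to be read.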