Filling in missing dates to get a rolling average: CTE in Snowflake

Question

I have two tables: activity and purchase.

Activity table:

user_id     date      videos_watched
   1     2020-01-02        3
   1     2020-01-04        5
   1     2020-01-07        5

Purchase table:

user_id  purchase_date 
   1       2020-01-01 
   2       2020-02-02

What I'd like to do is get a 30-day rolling average of videos watched, counted from the purchase date.

The base query looks like this:

    SELECT
        DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
        AVG(a.videos_watched) AS avg_videos_watched
    FROM purchase p
    LEFT OUTER JOIN activity a
        ON p.user_id = a.user_id
        AND a.date >= p.purchase_date
        AND a.date <= DATEADD(DAY, 30, p.purchase_date)
    GROUP BY 1;

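However, the activity table only has a row for each day on which a video was watched. I'd like to fill in the blanks for the days on which no video was watched.

I have started looking into using a CTE like this: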

    WITH cte AS (
        SELECT date('2020-01-01') AS fdate
        UNION ALL
        SELECT CAST(DATEADD(day, 1, fdate) AS date)
        FROM cte
        WHERE fdate < date('2020-04-01')
    )
    SELECT *
    FROM cte
    CROSS JOIN purchase p
    LEFT OUTER JOIN activity a
        ON p.user_id = a.user_id
        AND a.fdate = p.purchase_date
        AND a.date >= p.purchase_date
        AND a.date <= DATEADD(day, 30, p.purchase_date)

The end goal is to have something like this:

days_since_purchase    videos_watched
        1                   3
        2                   0 --CTE coalesce inserted value
        3                   0
        4                   5

I've spent the last few hours trying to get this right, but I still can't quite get the hang of it.

Answers

If you want to fill in the gaps in the result set, then I think you should generate integers rather than dates:

    WITH cte AS (
        SELECT 1 AS day_since_purchase
        UNION ALL
        SELECT 1 + day_since_purchase
        FROM cte
        WHERE day_since_purchase < 4  -- use 30 to cover the full 30-day window
    )
    SELECT cte.day_since_purchase,
           COALESCE(pa.avg_videos_watched, 0) AS avg_videos_watched
    FROM cte
    LEFT JOIN (
        SELECT DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
               AVG(a.videos_watched) AS avg_videos_watched
        FROM purchase p
        JOIN activity a
            ON p.user_id = a.user_id
            AND a.fdate = p.purchase_date  -- fdate as used in the question's query
            AND a.date >= p.purchase_date
            AND a.date <= DATEADD(day, 30, p.purchase_date)
        GROUP BY 1
    ) pa
        ON pa.day_since_purchase = cte.day_since_purchase;
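
For readers without a Snowflake session, the integer-series idea can be tried locally. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Snowflake, replaces DATEDIFF with a julianday() subtraction and DATEADD with SQLite's date() modifier, loads the sample rows from the question, leaves out the fdate condition (that column is not in the sample data), and generates 7 days instead of 30 to keep the output short.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase (user_id INTEGER, purchase_date TEXT);
    CREATE TABLE activity (user_id INTEGER, date TEXT, videos_watched INTEGER);
    INSERT INTO purchase VALUES (1, '2020-01-01'), (2, '2020-02-02');
    INSERT INTO activity VALUES
        (1, '2020-01-02', 3), (1, '2020-01-04', 5), (1, '2020-01-07', 5);
""")

GAP_FILL_SQL = """
WITH RECURSIVE days(day_since_purchase) AS (
    -- Generate the integers 1..7, one row per day since purchase.
    SELECT 1
    UNION ALL
    SELECT day_since_purchase + 1 FROM days WHERE day_since_purchase < 7
)
SELECT d.day_since_purchase,
       COALESCE(pa.avg_videos_watched, 0) AS avg_videos_watched
FROM days d
LEFT JOIN (
    -- julianday() subtraction plays the role of DATEDIFF(DAY, ...) here.
    SELECT CAST(julianday(a.date) - julianday(p.purchase_date) AS INTEGER)
               AS day_since_purchase,
           AVG(a.videos_watched) AS avg_videos_watched
    FROM purchase p
    JOIN activity a
      ON a.user_id = p.user_id
     AND a.date >= p.purchase_date
     AND a.date <= date(p.purchase_date, '+30 day')
    GROUP BY day_since_purchase
) pa ON pa.day_since_purchase = d.day_since_purchase
ORDER BY d.day_since_purchase
"""

filled_rows = conn.execute(GAP_FILL_SQL).fetchall()
for day, avg_watched in filled_rows:
    print(day, avg_watched)
```

Days without any activity come back as 0 because of the COALESCE around the left-joined average. Note that with this shape the average for a day only covers users who had activity on that day.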

You can use a recursive query to generate the 30 days following each purchase, then bring in the activity table:

    with cte as (
        -- anchor: one row per purchase, at day 0
        select
            purchase_date, user_id, 0 as days_since_purchase, purchase_date as dt
        from purchase
        union all
        -- recursive step: advance one day at a time, up to day 30
        select
            purchase_date, user_id, days_since_purchase + 1,
            dateadd(day, days_since_purchase + 1, purchase_date)
        from cte
        where days_since_purchase < 30
    )
    select
        c.days_since_purchase,
        avg(coalesce(a.videos_watched, 0)) as avg_videos_watched
    from cte c
    left join activity a
        on  a.user_id = c.user_id
        and a.fdate = c.purchase_date
        and a.date = c.dt
    group by c.days_since_purchase

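It is not clear from your question whether the activity table has a column that stores the purchase date each row relates to. Your query uses a column fdate, but it does not appear in your sample data. I used that column in the query (without it, you might end up counting the same activity under different purchases).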
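
The per-purchase approach can be checked against the sample data in the same way. This is a rough sketch, not Snowflake code: it assumes SQLite (through Python's sqlite3 module) as a local stand-in, uses date(purchase_date, '+N day') in place of DATEADD, names the CTE calendar for illustration, and joins on user_id and date only, because the sample activity table has no fdate column.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase (user_id INTEGER, purchase_date TEXT);
    CREATE TABLE activity (user_id INTEGER, date TEXT, videos_watched INTEGER);
    INSERT INTO purchase VALUES (1, '2020-01-01'), (2, '2020-02-02');
    INSERT INTO activity VALUES
        (1, '2020-01-02', 3), (1, '2020-01-04', 5), (1, '2020-01-07', 5);
""")

PER_PURCHASE_SQL = """
WITH RECURSIVE calendar(user_id, purchase_date, days_since_purchase, dt) AS (
    -- Anchor: one row per purchase at day 0.
    SELECT user_id, purchase_date, 0, purchase_date FROM purchase
    UNION ALL
    -- Recursive step: one more day per iteration, up to day 30.
    SELECT user_id, purchase_date, days_since_purchase + 1,
           date(purchase_date, '+' || (days_since_purchase + 1) || ' day')
    FROM calendar
    WHERE days_since_purchase < 30
)
SELECT c.days_since_purchase,
       AVG(COALESCE(a.videos_watched, 0)) AS avg_videos_watched
FROM calendar c
LEFT JOIN activity a
  ON a.user_id = c.user_id
 AND a.date = c.dt
GROUP BY c.days_since_purchase
ORDER BY c.days_since_purchase
"""

per_purchase_rows = conn.execute(PER_PURCHASE_SQL).fetchall()
for day, avg_watched in per_purchase_rows[:8]:
    print(day, avg_watched)
```

Because every purchase contributes a row for every day, days without activity enter the average as 0. With the sample data, user 2 has no activity at all, so day 1 averages 3 and 0 to 1.5.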
