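Using a subquery in the WHERE clause of TimescaleDB's gapfill

Problem description

I want to run TimescaleDB's gapfill function so that the start and end of the range are generated automatically. For example, I want to run gapfill between the lowest and the highest timestamp stored in the table.

Given the sample table playground: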

CREATE TABLE public.playground (
    value1 numeric,
    "timestamp" bigint,
    name "char"
);

INSERT INTO playground(name,value1,timestamp)
VALUES ('test',100,1599100000000000000);

INSERT INTO playground(name,value1,timestamp)
VALUES ('test',100,1599100001000000000);

INSERT INTO playground(name,value1,timestamp)
VALUES ('test',100,1599300000000000000);

I tried to query the data like this:

SELECT time_bucket_gapfill(300E9::BIGINT,timestamp) as bucket
FROM playground
WHERE 
    timestamp >= (SELECT COALESCE(MIN(timestamp),0) FROM playground)
    AND
    timestamp < (SELECT COALESCE(MAX(timestamp),0) FROM playground)
GROUP BY bucket

I get this error:

ERROR: missing time_bucket_gapfill argument: Could not infer start from WHERE clause

If I use hard-coded timestamps instead, the query runs fine. For example:

SELECT time_bucket_gapfill(300E9::BIGINT,timestamp) as bucket
FROM playground
WHERE timestamp >= 0 AND timestamp < 15900000000000000
GROUP BY bucket

The other approach, passing the start and end of the range as arguments to the gapfill function, fails as well:

 WITH bounds AS (
  SELECT COALESCE(MIN(timestamp),0) as min,COALESCE(MAX(timestamp),0) as max
  FROM playground
  WHERE timestamp >= 0 AND timestamp < 15900000000000000
),gapfill as(
SELECT time_bucket_gapfill(300E9::BIGINT,timestamp,bounds.min,bounds.max) as bucket
FROM playground,bounds
GROUP BY bucket
)
select * from gapfill

ERROR: invalid time_bucket_gapfill argument: start must be a simple expression

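Solution

Inferring start and stop from the WHERE clause is only supported for direct column references.

See: https://github.com/timescale/timescaledb/issues/1345

So something like the following might work (I don't have access to TimescaleDB to test it), but give it a try: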

SELECT
    time_bucket_gapfill(300E9::BIGINT,time_range.min,time_range.max ) AS bucket
FROM
    (
        SELECT
            COALESCE(MIN(timestamp),0)   AS min,COALESCE(MAX(timestamp),0) AS max
        FROM
            playground
    ) AS time_range,playground
WHERE
    timestamp >= time_range.min
    AND timestamp < time_range.max
GROUP BY
    bucket;

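time_bucket_gapfill only accepts values for start and finish that can be evaluated to constants at query planning time. Expressions built from constants and now() therefore work, but an expression that accesses a table does not.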
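
As an illustration of the kind of bounds that are accepted, here is a minimal sketch that uses only constants and now(). The table metrics with columns ts (timestamptz) and value (numeric) is hypothetical and not part of the question; it is used because now() does not fit the bigint timestamps in playground.

```sql
-- Hypothetical table, assumed only for this example:
--   CREATE TABLE metrics (ts timestamptz, value numeric);
-- Both bounds are built from now() and a constant interval, so they can be
-- evaluated at planning time and gapfill can infer start and finish from WHERE.
SELECT time_bucket_gapfill('5 minutes', ts) AS bucket,
       avg(value) AS avg_value
FROM metrics
WHERE ts >= now() - INTERVAL '1 day'
  AND ts <  now()
GROUP BY bucket
ORDER BY bucket;
```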

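As long as this limitation of time_bucket_gapfill is in place, the desired behaviour cannot be achieved in a single query. The workaround is to compute the values for start and finish separately and then supply them to the query that calls time_bucket_gapfill. This can be done in a stored procedure or in the application.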
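
A rough sketch of the stored-procedure variant, assuming the playground table from the question. The function name gapfill_playground and its parameter are made up for illustration. The bounds are computed first and then inlined as literals into a dynamically built query, so the planner only ever sees constants. I have not verified this against a particular TimescaleDB version.

```sql
-- Hypothetical helper; name and signature are assumptions for this example.
CREATE OR REPLACE FUNCTION gapfill_playground(bucket_width bigint)
RETURNS TABLE (bucket bigint)
LANGUAGE plpgsql AS
$$
DECLARE
    range_start  bigint;
    range_finish bigint;
BEGIN
    -- Step 1: compute the bounds in a separate query.
    SELECT COALESCE(MIN("timestamp"), 0), COALESCE(MAX("timestamp"), 0)
    INTO range_start, range_finish
    FROM playground;

    -- Step 2: build the gapfill query with the bounds inlined as literals,
    -- so start and finish can be inferred from the WHERE clause.
    RETURN QUERY EXECUTE format(
        'SELECT time_bucket_gapfill(%L::bigint, "timestamp") AS bucket
         FROM playground
         WHERE "timestamp" >= %L::bigint
           AND "timestamp" <  %L::bigint
         GROUP BY bucket
         ORDER BY bucket',
        bucket_width, range_start, range_finish);
END
$$;

-- Usage: 300E9 nanoseconds per bucket, as in the question.
SELECT * FROM gapfill_playground(300000000000);
```

Note that, like the query in the question, this uses a strict upper bound, so the row holding the maximum timestamp falls outside the range.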

Side note: if you use a PREPARE statement in PostgreSQL 12, it is important to explicitly disable the generic plan, for the same reason.
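
One way to read this side note is in terms of the plan_cache_mode setting that PostgreSQL 12 introduced. The sketch below assumes that forcing custom plans is what is meant; the statement name gapfill_range is made up, and whether gapfill accepts the bound parameters this way is not something I have tested.

```sql
-- Assumption: "disable generic plan" maps to forcing custom plans, so the
-- parameter values are known when each execution is planned.
SET plan_cache_mode = force_custom_plan;

-- Hypothetical prepared statement using the playground table from the question.
PREPARE gapfill_range(bigint, bigint) AS
SELECT time_bucket_gapfill(300E9::BIGINT, "timestamp") AS bucket
FROM playground
WHERE "timestamp" >= $1
  AND "timestamp" <  $2
GROUP BY bucket;

-- The bounds would come from a separate MIN/MAX query run beforehand.
EXECUTE gapfill_range(1599100000000000000, 1599300000000000000);

DEALLOCATE gapfill_range;
RESET plan_cache_mode;
```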