Why does PostgreSQL treat my query differently in a function?

I have a very simple query, nothing complicated:

select *
from table_name
where id = 1234

...it runs in under 50 milliseconds.

Take that query and put it into a function:

CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN
    RETURN QUERY SELECT *
         FROM table_name
         where id = id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;

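Executing this function with select * from pie(123); takes 22 seconds.

If I hard-code an integer in place of id_param, the function executes in under 50 milliseconds.

Why does using a parameter in the WHERE clause make my function run slowly?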

Edit to add a concrete example:

CREATE TYPE test_type AS (gid integer, geocode character varying(9));

CREATE OR REPLACE FUNCTION geocode_route_by_geocode(geocode_param character)
  RETURNS SETOF test_type AS
$BODY$
BEGIN
RETURN QUERY EXECUTE
    'SELECT     gs.geo_shape_id AS gid,gs.geocode
    FROM geo_shapes gs
    WHERE geocode = $1
    AND geo_type = 1 
    GROUP BY geography,gid,geocode' USING geocode_param;
END;

$BODY$
  LANGUAGE plpgsql STABLE;
ALTER FUNCTION geocode_route_by_geocode(character)
  OWNER TO root;

--Runs in 20 seconds
select * from geocode_route_by_geocode('999xyz');

--Runs in 10 milliseconds
SELECT  gs.geo_shape_id AS gid,gs.geocode
    FROM geo_shapes gs
    WHERE geocode = '9999xyz'
    AND geo_type = 1 
    GROUP BY geography,geocode

Answer

Update for PostgreSQL 9.2

There was a major improvement. I quote the release notes:

Allow the planner to generate custom plans for specific parameter
values even when using prepared statements (Tom Lane)

In the past, a prepared statement always had a single “generic” plan
that was used for all parameter values, which was frequently much
inferior to the plans used for non-prepared statements containing
explicit constant values. Now, the planner attempts to generate custom
plans for specific parameter values. A generic plan will only be used
after custom plans have repeatedly proven to provide no benefit. This
change should eliminate the performance penalties formerly seen from
use of prepared statements (including non-dynamic statements in
PL/pgSQL).

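Original answer for PostgreSQL 9.1 or earlier

A plpgsql function has an effect similar to the PREPARE statement: queries are parsed and the query plan is cached.

The advantage is that some overhead is saved on every call.
The disadvantage is that the query plan is not optimized for the particular parameter values the function is called with.

For queries on tables with an even data distribution, this is generally not a problem, and PL/pgSQL functions perform somewhat faster than raw SQL queries or SQL functions. But if your query can use certain indexes depending on the actual values in the WHERE clause, or more generally, if a better query plan exists for particular values, you may end up with a sub-optimal query plan. Try an SQL function, or use dynamic SQL with EXECUTE to force the query to be re-planned on every call. That could look like this: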

CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN        
    RETURN QUERY EXECUTE
        'SELECT *
         FROM   table_name
         where  id = $1'
    USING id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;
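
The other option mentioned above, a plain SQL function, might look roughly like the sketch below. The name pie_sql is hypothetical, and the sketch assumes the table_name table from the question. Declaring RETURNS SETOF table_name (the table's row type) instead of SETOF record is an assumption of this sketch as well; it lets callers run the function without a column definition list, which a function returning SETOF record would require.

```sql
-- Hypothetical SQL-language variant of pie(); assumes table_name exists.
CREATE OR REPLACE FUNCTION pie_sql(id_param integer)
RETURNS SETOF table_name AS
$BODY$
    -- $1 refers to the first argument (id_param)
    SELECT *
    FROM   table_name
    WHERE  id = $1;
$BODY$
LANGUAGE sql STABLE;

-- Usage: no column definition list is needed
SELECT * FROM pie_sql(1234);
```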

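Edit after comment:

If this variant does not change the execution time, there must be other factors at play that you may have missed or did not mention. A different database? Different parameter values? You would have to post more details.

I add a quote from the manual to back up my statements above: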

An EXECUTE with a simple constant command string and some USING parameters, as in the first example above, is functionally equivalent to just writing the command directly in PL/pgSQL and allowing replacement of PL/pgSQL variables to happen automatically. The important difference is that EXECUTE will re-plan the command on each execution, generating a plan that is specific to the current parameter values; whereas PL/pgSQL normally creates a generic plan and caches it for re-use. In situations where the best plan depends strongly on the parameter values, EXECUTE can be significantly faster; while when the plan is not sensitive to parameter values, re-planning will be a waste.
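
One way to check whether plan caching is the culprit is to compare the plan of a prepared statement with the plan of the same query written with a literal value. This is only a sketch: the statement name pie_plan is hypothetical, and it assumes the table_name table and the id value from the question. On 9.1 or earlier the first EXPLAIN should show the generic plan; on 9.2 or later both plans will often be identical, because the planner builds custom plans for the first executions.

```sql
-- Hypothetical prepared statement; assumes table_name(id integer) exists.
PREPARE pie_plan(integer) AS
    SELECT * FROM table_name WHERE id = $1;

-- Plan used for the prepared, parameterized statement
EXPLAIN EXECUTE pie_plan(1234);

-- Plan used when the planner can see the constant
EXPLAIN SELECT * FROM table_name WHERE id = 1234;

-- Clean up the prepared statement
DEALLOCATE pie_plan;
```

If the two plans differ (for example a sequential scan versus an index scan), the generic plan is the likely reason for the slow function.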
