PostgreSQL:与 sql 相比，plpgsql 语言的相同请求速度较慢

我是 PostgreSQL 的新手，我面临着有关表函数性能的问题。我需要做的是相当于 MSSQL 中的存储过程。经过一些研究，我发现表函数是可行的方法，所以我举了一个例子来使用 plpgsql 创建我的函数。
通过比较执行时间，使用函数比直接调用查询慢 2 倍(查询在函数中完全相同)。
经过一番挖掘，我发现在我的函数中使用 SQL 语言可以大大缩短执行时间(与调用查询的时间完全相同)。阅读本文后，我明白 plpgsql 增加了一点开销，但差异太大而无法解释。
因为我没有使用任何 plpgsql 功能，所以这个解决方案对我来说很好，完全有道理。但是，我想了解为什么会有这种差异。如果我比较执行计划，与执行 GroupAggregate 和一些预排序的 SQL 版本相比，plpgsql 版本执行一些 HashAggregate 和顺序搜索......我确实按照 Laurenz Albe 的建议使用了 auto_explain，我在最后添加了两个执行计划.
为什么相同查询的执行计划与语言的唯一区别如此不同？而且，即使是 SUM 的结果(见下面我的请求)也有显着差异。我知道我使用的是 float 值，因此每次调用之间的结果可能略有不同，但在这种情况下，查询和函数之间的差异约为 ~3，这是出乎意料的(~10001 vs ~9998)。
在使用 2 个表和 2 个函数重现问题的代码下方。
请注意，我使用的是 PostgreSQL 12。
任何解释表示赞赏:) 谢谢。

-- Step 1: Create database

-- Step 2: Create tables
-- table1
CREATE TABLE public.table1(area real, code text COLLATE pg_catalog."default");

-- table 2
CREATE TABLE public.table2(code text COLLATE pg_catalog."default" NOT NULL, surface real, CONSTRAINT table2_pkey PRIMARY KEY (code));

-- Step 3: create functions
-- plpgsql
CREATE OR REPLACE FUNCTION public.test_function()
    RETURNS TABLE(code text, value real) 
    LANGUAGE 'plpgsql'

    COST 100
    VOLATILE 
    ROWS 1000
AS $BODY$
BEGIN
   RETURN QUERY 
   
   SELECT table2.code, (case when (sum(area) * surface) IS NULL then 0 else (sum(area) * surface) end) AS value
   FROM table1 
   INNER JOIN table2 on table1.code = table2.code 
   GROUP BY table2.code, surface
   ;
END;
$BODY$;

-- sql
CREATE OR REPLACE FUNCTION public.test_function2()
  RETURNS TABLE(code text, value real) 
  LANGUAGE SQL
AS $BODY$
   SELECT table2.code, (case when (sum(area) * surface) IS NULL then 0 else (sum(area) * surface) end) AS value
   FROM table1 
   INNER JOIN table2 on table1.code = table2.code 
   GROUP BY table2.code, surface
$BODY$;

-- Step 4: insert some random data
-- table 2
INSERT INTO table2(code, surface) VALUES ('AAAAA', 1);
INSERT INTO table2(code, surface) VALUES ('BBBBB', 1);
INSERT INTO table2(code, surface) VALUES ('CCCCC', 1);
INSERT INTO table2(code, surface) VALUES ('DDDDD', 1);
INSERT INTO table2(code, surface) VALUES ('EEEEE', 1);

-- table1 (will take some time, this simulate my current query with 10 millions rows)
DO
$$
DECLARE random_code text;
DECLARE code_count int := (SELECT COUNT(*) FROM table2);
BEGIN
     FOR i IN 1..10000000 LOOP
        random_code := (SELECT code FROM table2 OFFSET floor(random() * code_count) LIMIT 1);       
        INSERT INTO public.table1(area, code) VALUES (random() / 100, random_code);
    END LOOP;
END
$$  

-- Step 5: compare
SELECT * FROM test_function()
SELECT * FROM test_function2() -- 2 times faster

test_function 的执行计划(plpgsql 版本)

2021-04-14 11:52:10.335 GMT [5056] LOG:  duration: 3808.919 ms  plan:
    Query Text: SELECT table2.code, (case when (sum(area) * surface) IS NULL then 0 else (sum(area) * surface) end) AS value
       FROM table1 
       INNER JOIN table2 on table1.code = table2.code 
       GROUP BY table2.code, surface
    HashAggregate  (cost=459899.03..459918.08 rows=1270 width=40) (actual time=3808.908..3808.913 rows=5 loops=1)
      Group Key: table2.code
      Buffers: shared hit=34 read=162130
      ->  Hash Join  (cost=38.58..349004.15 rows=14785984 width=40) (actual time=215.340..2595.247 rows=10000000 loops=1)
            Hash Cond: (table1.code = table2.code)
            Buffers: shared hit=34 read=162130
            ->  Seq Scan on table1  (cost=0.00..310022.84 rows=14785984 width=10) (actual time=215.294..1036.615 rows=10000000 loops=1)
                  Buffers: shared hit=33 read=162130
            ->  Hash  (cost=22.70..22.70 rows=1270 width=36) (actual time=0.019..0.020 rows=5 loops=1)
                  Buckets: 2048  Batches: 1  Memory Usage: 17kB
                  Buffers: shared hit=1
                  ->  Seq Scan on table2  (cost=0.00..22.70 rows=1270 width=36) (actual time=0.013..0.014 rows=5 loops=1)
                        Buffers: shared hit=1
2021-04-14 11:52:10.335 GMT [5056] CONTEXT:  PL/pgSQL function test_function() line 3 at RETURN QUERY

test_function2(sql版本)的执行计划

2021-04-14 11:54:24.122 GMT [5056] LOG:  duration: 1513.001 ms  plan:
    Query Text: 
       SELECT table2.code, (case when (sum(area) * surface) IS NULL then 0 else (sum(area) * surface) end) AS value
       FROM table1 
       INNER JOIN table2 on table1.code = table2.code 
       GROUP BY table2.code, surface
    
    Finalize GroupAggregate  (cost=271918.31..272252.77 rows=1270 width=40) (actual time=1484.846..1512.998 rows=5 loops=1)
      Group Key: table2.code
      Buffers: shared hit=96 read=162098
      ->  Gather Merge  (cost=271918.31..272214.67 rows=2540 width=40) (actual time=1484.840..1512.990 rows=15 loops=1)
            Workers Planned: 2
            Workers Launched: 2
            Buffers: shared hit=96 read=162098
            ->  Sort  (cost=270918.29..270921.46 rows=1270 width=40) (actual time=1435.897..1435.899 rows=5 loops=3)
                  Sort Key: table2.code
                  Sort Method: quicksort  Memory: 25kB
                  Worker 0:  Sort Method: quicksort  Memory: 25kB
                  Worker 1:  Sort Method: quicksort  Memory: 25kB
                  Buffers: shared hit=96 read=162098
                  ->  Partial HashAggregate  (cost=270840.11..270852.81 rows=1270 width=40) (actual time=1435.857..1435.863 rows=5 loops=3)
                        Group Key: table2.code
                        Buffers: shared hit=74 read=162098
                        ->  Hash Join  (cost=38.58..240035.98 rows=6160827 width=40) (actual time=204.916..1022.133 rows=3333333 loops=3)
                              Hash Cond: (table1.code = table2.code)
                              Buffers: shared hit=74 read=162098
                              ->  Parallel Seq Scan on table1  (cost=0.00..223771.27 rows=6160827 width=10) (actual time=204.712..486.850 rows=3333333 loops=3)
                                    Buffers: shared hit=65 read=162098
                              ->  Hash  (cost=22.70..22.70 rows=1270 width=36) (actual time=0.155..0.156 rows=5 loops=3)
                                    Buckets: 2048  Batches: 1  Memory Usage: 17kB
                                    Buffers: shared hit=3
                                    ->  Seq Scan on table2  (cost=0.00..22.70 rows=1270 width=36) (actual time=0.142..0.143 rows=5 loops=3)
                                          Buffers: shared hit=3
2021-04-14 11:54:24.122 GMT [5056] CONTEXT:  SQL function "test_function2" statement 1

最佳答案

一、一般讨论这种情况下如何获取执行计划
要深入了解，请激活 auto_explain并跟踪 postgresql.conf 中的函数执行情况:

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 0
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_nested_statements = on
track_functions = 'pl'

然后重启数据库。不要在繁忙的生产数据库上这样做，因为它会记录大量日志并增加相当大的开销!
使用以下命令重置数据库统计信息

SELECT pg_stat_reset();

现在，函数中所有 SQL 语句的执行计划都将被记录下来，并且 PostgreSQL 会跟踪函数执行时间。
查看从 SQL 函数和 PL/pgSQL 函数调用时语句的执行计划和执行时间，看看您是否能发现差异。然后比较pg_stat_user_functions中的执行时间来比较函数的执行时间。
在查看执行计划后，对当前案例的解释
从 PL/pgSQL 运行的查询不是并行化的。由于实现上的限制，查询使用 RETURN QUERY 运行从来没有。

关于PostgreSQL:与 sql 相比，plpgsql 语言的相同请求速度较慢，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/67083583/

PostgreSQL:与 sql 相比，plpgsql 语言的相同请求速度较慢

上一篇：angular - NPM 安装的应用程序(例如文件 C :\Users\name\AppData\Roaming\npm\ng. ps1)无法加载，因为在此系统上禁用了运行脚本

下一篇：amazon-s3 - Loki 配置与 s3