sql - Hardcoding a function parameter yields a 5x speedup

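I have the following stored procedure, which builds a dynamic query.

Given a list of conditions/filters, it finds all Visitors that belong to a given App. The app_id is passed in as a parameter.

If I call the function with the app ID and use that parameter in the dynamic query, it runs in about 200 ms.

However, if I hardcode the app_id, it runs in < 20 ms.

Here is an example of how I call the procedure: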

SELECT id
FROM find_matching_visitors('my_app_id', '{}', '{( field = ''app_name'' and string_value ILIKE ''My awesome app''  )}')

Any ideas as to why?

    CREATE OR REPLACE FUNCTION find_matching_visitors(app_id text, default_filters text[], custom_filters text[])
    RETURNS TABLE (
      id varchar
    ) AS
    $body$
    DECLARE
        default_filterstring text;
        custom_filterstring text;
        default_filter_length integer;
        custom_filter_length integer;
        sql VARCHAR;
    BEGIN
        default_filter_length := COALESCE(array_length(default_filters, 1), 0);
        custom_filter_length := COALESCE(array_length(custom_filters, 1), 0);

        default_filterstring := array_to_string(default_filters, ' AND ');
        custom_filterstring := array_to_string(custom_filters, ' OR ');

        IF custom_filterstring = '' or custom_filterstring is null THEN
            custom_filterstring := '1=1';
        END IF;

        IF default_filterstring = '' or default_filterstring is null THEN
            default_filterstring := '1=1';
        END IF;

        sql := format('
                    SELECT v.id FROM visitors v
                    LEFT JOIN trackings t on v.id = t.visitor_id
                    WHERE v.app_id = ''HARDCODED_APP_ID'' and (%s) and (%s)
                    group by v.id

                ', custom_filterstring, default_filterstring, custom_filter_length, custom_filter_length);
        RETURN QUERY EXECUTE sql;

    END;
    $body$
    LANGUAGE 'plpgsql';

EXPLAIN ANALYZE without hardcoding app_id

Limit  (cost=481.86..481.99 rows=50 width=531) (actual time=163.579..163.581 rows=9 loops=1)
  ->  Sort  (cost=481.86..484.26 rows=960 width=531) (actual time=163.578..163.579 rows=9 loops=1)
        Sort Key: v0.last_seen DESC
        Sort Method: quicksort  Memory: 30kB
        ->  WindowAgg  (cost=414.62..449.97 rows=960 width=531) (actual time=163.553..163.560 rows=9 loops=1)
              ->  Hash Join  (cost=414.62..437.97 rows=960 width=523) (actual time=163.525..163.537 rows=9 loops=1)
                    Hash Cond: ((find_matching_visitors.id)::text = (v0.id)::text)
                    ->  Function Scan on find_matching_visitors  (cost=0.25..10.25 rows=1000 width=32) (actual time=153.918..153.918 rows=9 loops=1)
                    ->  Hash  (cost=354.19..354.19 rows=4814 width=523) (actual time=9.578..9.578 rows=4887 loops=1)
                          Buckets: 8192  Batches: 1  Memory Usage: 2145kB
                          ->  Seq Scan on visitors v0  (cost=0.00..354.19 rows=4814 width=523) (actual time=0.032..4.993 rows=4887 loops=1)
                                Filter: ((NOT merged) AND (((type)::text = 'user'::text) OR ((type)::text = 'lead'::text)))
                                Rows Removed by Filter: 138
Planning time: 1.134 ms
Execution time: 163.705 ms

EXPLAIN ANALYZE with app_id hardcoded

Limit  (cost=481.86..481.99 rows=50 width=531) (actual time=25.890..25.893 rows=9 loops=1)
  ->  Sort  (cost=481.86..484.26 rows=960 width=531) (actual time=25.888..25.890 rows=9 loops=1)
        Sort Key: v0.last_seen DESC
        Sort Method: quicksort  Memory: 30kB
        ->  WindowAgg  (cost=414.62..449.97 rows=960 width=531) (actual time=25.862..25.870 rows=9 loops=1)
              ->  Hash Join  (cost=414.62..437.97 rows=960 width=523) (actual time=25.830..25.841 rows=9 loops=1)
                    Hash Cond: ((find_matching_visitors.id)::text = (v0.id)::text)
                    ->  Function Scan on find_matching_visitors  (cost=0.25..10.25 rows=1000 width=32) (actual time=15.875..15.876 rows=9 loops=1)
                    ->  Hash  (cost=354.19..354.19 rows=4814 width=523) (actual time=9.936..9.936 rows=4887 loops=1)
                          Buckets: 8192  Batches: 1  Memory Usage: 2145kB
                          ->  Seq Scan on visitors v0  (cost=0.00..354.19 rows=4814 width=523) (actual time=0.013..5.232 rows=4887 loops=1)
                                Filter: ((NOT merged) AND (((type)::text = 'user'::text) OR ((type)::text = 'lead'::text)))
                                Rows Removed by Filter: 138
Planning time: 0.772 ms
Execution time: 26.006 ms

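Update 1: Added EXPLAIN ANALYZE output for both cases. Note: the two plans are exactly the same; only the time spent differs, and almost all of the difference is in the Function Scan on find_matching_visitors.

Update 2: It turns out I needed to pass app_id as an argument to the format function instead of embedding it directly in the query string. This brought the query time down to roughly 20-30 ms.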
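
The question does not show the corrected function, so the following is only a sketch of what Update 2 may look like, assuming the same visitors and trackings tables as above. The app_id argument is handed to format() and rendered with %L, which quotes it as a SQL literal. The COALESCE/NULLIF shortcut for empty filters is a simplification introduced here, not part of the original code.

```sql
CREATE OR REPLACE FUNCTION find_matching_visitors(app_id text, default_filters text[], custom_filters text[])
RETURNS TABLE (
  id varchar
) AS
$body$
DECLARE
    default_filterstring text;
    custom_filterstring text;
    sql text;
BEGIN
    -- An empty or NULL filter array falls back to a condition that is always true.
    default_filterstring := COALESCE(NULLIF(array_to_string(default_filters, ' AND '), ''), '1=1');
    custom_filterstring := COALESCE(NULLIF(array_to_string(custom_filters, ' OR '), ''), '1=1');

    -- %L inserts app_id as a properly quoted literal, so the dynamic query
    -- compares v.app_id against a constant value.
    sql := format('
                SELECT v.id FROM visitors v
                LEFT JOIN trackings t ON v.id = t.visitor_id
                WHERE v.app_id = %L AND (%s) AND (%s)
                GROUP BY v.id
            ', app_id, custom_filterstring, default_filterstring);

    RETURN QUERY EXECUTE sql;
END;
$body$
LANGUAGE plpgsql;
```

PL/pgSQL variables are not visible inside a string run by EXECUTE, so a bare app_id written in the query text would be resolved as a column name rather than as the function argument. Besides format() with %L, the value can also be supplied by writing $1 in the query text and running RETURN QUERY EXECUTE sql USING app_id.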

Best answer

Hardcoded values matter when the planner chooses the best query plan. For example:

select * from some_table where id_person=231
select * from some_table where id_person=10

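When 90% of the rows in some_table have id_person=231, pg uses a full table scan, because that is the fastest option. When only 1% of the rows have id_person=10, it uses an index scan. So the plan that gets used depends on the value of the parameter.

When you use a value that is not hardcoded, for example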

select * from some_table where id_person=?

the planner cannot determine the best plan for the actual value, and the query may run more slowly.
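
One way to observe this is to build a skewed table and compare the plans. The script below is a minimal sketch: the table definition, index name, row counts, and the by_person statement name are all made up for illustration, and the plans you actually get depend on your PostgreSQL version, settings, and statistics.

```sql
-- Hypothetical table: about 90% of rows have id_person = 231,
-- the rest are spread thinly over 100 other values (1000, 1010, ..., 1990).
CREATE TABLE some_table (
    id serial PRIMARY KEY,
    id_person integer NOT NULL
);

INSERT INTO some_table (id_person)
SELECT CASE WHEN g % 10 = 0 THEN (g % 1000) + 1000 ELSE 231 END
FROM generate_series(1, 100000) AS g;

CREATE INDEX some_table_id_person_idx ON some_table (id_person);

-- Refresh statistics so the planner knows about the skew.
ANALYZE some_table;

-- Common value: a sequential scan is the likely choice.
EXPLAIN SELECT * FROM some_table WHERE id_person = 231;

-- Rare value: an index or bitmap scan is the likely choice.
EXPLAIN SELECT * FROM some_table WHERE id_person = 1010;

-- Parameterized form: the value is supplied at execution time.
PREPARE by_person(integer) AS
    SELECT * FROM some_table WHERE id_person = $1;

EXPLAIN EXECUTE by_person(231);
EXPLAIN EXECUTE by_person(1010);
```

For prepared statements, PostgreSQL plans the first few executions with the actual parameter values and may later switch to a generic plan that is built without knowing them, which is where a parameterized query can end up with a worse plan than its hardcoded equivalent.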

Regarding "sql - Hardcoding a function parameter yields a 5x speedup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40526608/
