database - Postgres Materialize causes poor performance in delete query

Tags: database postgresql query-performance materialized-views sql-delete

I have a DELETE query that I need to run on PostgreSQL 9.0.4. It performs well until the sub-select returns 524,289 rows.

For example, at 524,288 rows the plan contains no Materialize node and the cost looks reasonable:

explain DELETE FROM table1 WHERE pointLevel = 0 AND userID NOT IN
(SELECT userID FROM table2 fetch first 524288 rows only);
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete  (cost=13549.49..17840.67 rows=21 width=6)
   ->  Index Scan using jslps_userid_nopt on table1  (cost=13549.49..17840.67 rows=21 width=6)
         Filter: ((NOT (hashed SubPlan 1)) AND (pointlevel = 0))
         SubPlan 1
           ->  Limit  (cost=0.00..12238.77 rows=524288 width=8)
                 ->  Seq Scan on table2  (cost=0.00..17677.92 rows=757292 width=8)
(6 rows)

However, as soon as I hit 524,289 rows, a Materialize node appears in the plan and the estimated cost of the DELETE rises sharply:

explain DELETE FROM table1 WHERE pointLevel = 0 AND userID NOT IN
(SELECT userID FROM table2 fetch first 524289 rows only);

                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Delete  (cost=0.00..386910.33 rows=21 width=6)
   ->  Index Scan using jslps_userid_nopt on table1  (cost=0.00..386910.33 rows=21 width=6)
         Filter: ((pointlevel = 0) AND (NOT (SubPlan 1)))
         SubPlan 1
           ->  Materialize  (cost=0.00..16909.24 rows=524289 width=8)
                 ->  Limit  (cost=0.00..12238.79 rows=524289 width=8)
                       ->  Seq Scan on table2  (cost=0.00..17677.92 rows=757292 width=8)
(7 rows)

I worked around the issue by using a JOIN in the sub-select query instead:

SELECT s.userid 
FROM table1 s 
LEFT JOIN table2 p ON s.userid=p.userid
WHERE p.userid IS NULL AND s.pointlevel=0

However, I would still like to understand why the Materialize step degrades performance so drastically.

Best answer

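My guess is that at rows=524289 the memory buffer fills up, so the subquery has to be materialized on disk, and the time required rises sharply. The two plans are consistent with a memory limit being crossed: the filter changes from "hashed SubPlan 1" to a plain "SubPlan 1" over a Materialize node, so the subquery result is no longer probed through an in-memory hash table.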

You can read more about configuring the memory buffers here: http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html
If you experiment with work_mem, you will see the query behave differently.
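
One way to try this is sketched below. It assumes the table and column names from the question, and the 64MB value is an arbitrary illustration rather than a recommendation. EXPLAIN only plans the statement, so no rows are deleted.

```sql
-- Show the current per-operation memory limit for this session.
SHOW work_mem;

-- Raise it for this session only (64MB is an assumed, illustrative value).
SET work_mem = '64MB';

-- Re-plan the statement that used Materialize and check whether the
-- Filter line goes back to "hashed SubPlan 1".
EXPLAIN DELETE FROM table1 WHERE pointLevel = 0 AND userID NOT IN
(SELECT userID FROM table2 FETCH FIRST 524289 ROWS ONLY);

-- Restore the configured default.
RESET work_mem;
```

If the plan changes as work_mem is raised and lowered, that supports the memory-limit explanation; if it does not, the cause lies elsewhere.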

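That said, using a join in the subquery is the better way to speed up the query, because it limits the number of rows at the source instead of simply selecting the first XYZ rows and then performing the check.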
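
As a rough sketch of the same idea applied to the DELETE directly, the anti-join can also be written with NOT EXISTS. This is not the asker's exact statement: it drops the "fetch first N rows" limit, and it treats NULLs differently from NOT IN (a NULL userid in table2 makes NOT IN match nothing, while NOT EXISTS ignores it). The aliases s and p follow the question's SELECT.

```sql
-- Delete table1 rows at pointlevel 0 that have no matching userid in table2.
DELETE FROM table1 s
WHERE s.pointlevel = 0
  AND NOT EXISTS (
      SELECT 1
      FROM table2 p
      WHERE p.userid = s.userid
  );
```

Running EXPLAIN on this statement first shows whether the planner picks an anti-join rather than a per-row subplan.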

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/26477353/
