postgresql - Postgres explain plans are different

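I have 2 databases running on the same server. Both are development databases with the same structure, but with different amounts of data. They run Postgres 9.2 on Linux (Centos/Redhat).

I run the following SQL on each database, but get very different performance results.

Note: the write_history table has the same structure in each database, with the same indexes and constraints.

Here is the SQL: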

 EXPLAIN ANALYZE
 SELECT *
 FROM write_history
 WHERE fk_device_rw_id = 'd2c969b8-2609-11e3-80ca-0750cff1e96c'
   AND fk_write_history_status_id = 5
 ORDER BY update_time DESC
 LIMIT 1;

And here is the explain plan for each database:

DB1 - PreProd

 Limit  (cost=57902.39..57902.39 rows=1 width=103) (actual time=0.056..0.056 rows=0 loops=1)
   ->  Sort  (cost=57902.39..57908.69 rows=2520 width=103) (actual time=0.053..0.053 rows=0 loops=1)
         Sort Key: update_time
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on write_history  (cost=554.04..57889.79 rows=2520 width=103) (actual time=0.033..0.033 rows=0 loops=1)
               Recheck Cond: (fk_device_rw_id = 'd2c969b8-2609-11e3-80ca-0750cff1e96c'::uuid)
               Filter: (fk_write_history_status_id = 5)
               ->  Bitmap Index Scan on idx_write_history_fk_device_rw_id  (cost=0.00..553.41 rows=24034 width=0) (actual time=0.028..0.028 rows=0 loops=1)
                     Index Cond: (fk_device_rw_id = 'd2c969b8-2609-11e3-80ca-0750cff1e96c'::uuid)
 Total runtime: 0.112 ms

DB2 - QA

 Limit  (cost=50865.41..50865.41 rows=1 width=108) (actual time=545.521..545.523 rows=1 loops=1)
   ->  Sort  (cost=50865.41..50916.62 rows=20484 width=108) (actual time=545.518..545.518 rows=1 loops=1)
         Sort Key: update_time
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on write_history  (cost=1431.31..50762.99 rows=20484 width=108) (actual time=21.931..524.034 rows=22034 loops=1)
               Recheck Cond: (fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'::uuid)
               Rows Removed by Index Recheck: 1401986
               Filter: (fk_write_history_status_id = 5)
               Rows Removed by Filter: 40161
               ->  Bitmap Index Scan on idx_write_history_fk_device_rw_id  (cost=0.00..1426.19 rows=62074 width=0) (actual time=19.167..19.167 rows=62195 loops=1)
                     Index Cond: (fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'::uuid)
 Total runtime: 545.588 ms

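A couple of questions:

What is the difference between "Sort Method: quicksort  Memory: 25kB" and "Sort Method: top-N heapsort  Memory: 25kB"?
What could be the reason the total runtimes are so different?

Table row counts:

DB1: write_history row count: 5,863,565

DB2: write_history row count: 2,670,888

Please let me know if more information is needed. Thanks for your help.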

Best answer

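A top-N sort means the sort is serving an ORDER BY ... LIMIT N, and it discards any tuple as soon as it can show that the tuple cannot be in the top N. The decision to switch to a top-N sort is made dynamically during the sort. Because the sort in DB1 received zero input tuples, it never decided to switch. The difference in the reported method is therefore a result, not a cause; it does not matter in this case.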
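To see the dynamic switch in isolation, here is a minimal sketch that does not touch write_history at all. It assumes nothing beyond the built-in generate_series function; the row counts are arbitrary illustrative values. The first statement should report "Sort Method: top-N heapsort" and the second "Sort Method: quicksort", matching what DB2 and DB1 show, though the exact plan output can vary by version.

```sql
-- Many input rows with LIMIT 1: the sort can switch to a bounded heap
-- and throw away rows that cannot be the single top row.
EXPLAIN ANALYZE
SELECT g
FROM generate_series(1, 100000) AS g
ORDER BY g DESC
LIMIT 1;

-- Zero input rows with the same LIMIT: no tuple ever arrives,
-- so the sort never gets the chance to switch methods.
EXPLAIN ANALYZE
SELECT g
FROM generate_series(1, 0) AS g
ORDER BY g DESC
LIMIT 1;
```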

I think the key for you is the Bitmap Heap Scan:

(actual time=0.033..0.033 rows=0 loops=1)

(actual time=21.931..524.034 rows=22034 loops=1)

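The smaller database has many more rows that match your criteria, so it has much more work to do.

There is also the amount of recheck work that had to be done: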

Rows Removed by Index Recheck: 1401986

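This suggests that your work_mem is set to a very small value, so your bitmap is overflowing it. When a bitmap no longer fits in work_mem it becomes lossy: it remembers whole pages instead of individual rows, so every row on those pages has to be rechecked against the index condition.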
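One way to test this is to raise work_mem for a single session and re-run the DB2 query. This is only a sketch: the '64MB' value is an illustrative guess, not a tuned recommendation, and the UUID is the one that appears in the DB2 plan above. If the bitmap was overflowing, "Rows Removed by Index Recheck" should drop sharply or disappear.

```sql
-- Check the memory budget currently in effect for this session.
SHOW work_mem;

-- Raise it for this session only ('64MB' is an assumed, illustrative value).
SET work_mem = '64MB';

-- Re-run the query and compare the recheck count and the total runtime.
EXPLAIN ANALYZE
SELECT *
FROM write_history
WHERE fk_device_rw_id = 'd2cd81a6-2609-11e3-b574-47328bfa4c38'
  AND fk_write_history_status_id = 5
ORDER BY update_time DESC
LIMIT 1;

-- Go back to the configured default when done.
RESET work_mem;
```

Because SET only affects the current session, this experiment does not change the behaviour of other connections to the server.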

Regarding postgresql - Postgres explain plans are different, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24737291/
