sql - Performance of DELETE with NOT IN (SELECT ...)

Tags: sql postgresql sql-delete postgresql-performance

I have these two tables and want to delete from ms_author every author that does not exist in author.

author (1.6 million rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI | true  |
| name  | text        | YES  |     |       |
+-------+-------------+------+-----+-------+

ms_author (120 million rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI |       |
| name  | text        | YES  |     | true  |
+-------+-------------+------+-----+-------+

Here is my query:

    DELETE
    FROM ms_author AS m
    WHERE m.name NOT IN
                       (SELECT a.name
                        FROM author AS a);

I estimated the running time of this query at about 130 hours.
Is there a faster way to do this?

Edit:

EXPLAIN VERBOSE output:

Delete on public.ms_author m  (cost=0.00..2906498718724.75 rows=59946100 width=6)
  ->  Seq Scan on public.ms_author m  (cost=0.00..2906498718724.75 rows=59946100 width=6)
        Output: m.ctid
        Filter: (NOT (SubPlan 1))
        SubPlan 1
          ->  Materialize  (cost=0.00..44334.43 rows=1660295 width=15)
                Output: a.name
                ->  Seq Scan on public.author a  (cost=0.00..27925.95 rows=1660295 width=15)
                      Output: a.name

Index on author(name):

create index author_name on author(name);

Index on ms_author(name):

create index ms_author_name on ms_author(name);

Best answer

I'm a big fan of the "anti-join". It works efficiently for both large and small datasets:

delete from ms_author ma
where not exists (
  select null
  from author a
  where ma.name = a.name
)
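
One thing to check before swapping the two forms: both `name` columns are nullable, and `NOT IN` and `NOT EXISTS` treat NULLs differently. The following is a minimal sketch on toy temporary tables (`author_demo` and `ms_author_demo` are hypothetical names chosen so that nothing real is touched), not a statement about the actual data:

```sql
-- Toy stand-ins for author and ms_author (hypothetical names).
CREATE TEMP TABLE author_demo    (id text PRIMARY KEY, name text);
CREATE TEMP TABLE ms_author_demo (id text PRIMARY KEY, name text);

INSERT INTO author_demo    VALUES ('a1', 'Ann'), ('a2', NULL);
INSERT INTO ms_author_demo VALUES ('m1', 'Ann'), ('m2', 'Bob'), ('m3', NULL);

-- NOT IN: because the subquery returns a NULL, the predicate is never
-- true for a non-matching name, so this query matches no rows at all.
SELECT m.id
FROM ms_author_demo AS m
WHERE m.name NOT IN (SELECT a.name FROM author_demo AS a);

-- NOT EXISTS: matches m2 ('Bob') and m3 (NULL name), since a NULL name
-- never compares equal to anything and therefore has no match.
SELECT m.id
FROM ms_author_demo AS m
WHERE NOT EXISTS (
    SELECT 1
    FROM author_demo AS a
    WHERE a.name = m.name
);
```

So if ms_author contains rows with a NULL name, the `NOT EXISTS` delete removes them while the original `NOT IN` delete would not; adding `AND ma.name IS NOT NULL` keeps them if that is the intent. To confirm the plan before committing to the delete, prefix the statement with plain `EXPLAIN` (without `ANALYZE`, so it is only planned and not executed) and look for an anti-join node in place of `Filter: (NOT (SubPlan 1))`.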

For more on "sql - Performance of DELETE with NOT IN (SELECT ...)", see the corresponding question on Stack Overflow: https://stackoverflow.com/questions/34263149/