这是我的 Django ORM 查询:
Group.objects.filter(public = True)\
.annotate(num_members = Count('members', distinct = True))\
.annotate(num_images = Count('images', distinct = True))\
.order_by(sort)
不幸的是,即使只有几十个 Groups
,这也需要 30 多秒的时间。删除 annotate
语句使查询速度显着加快,仅需 3 毫秒...
我的数据库后端是 Postgres,这里是 SQL 和 explain
:
Executed SQL
SELECT ••• FROM "astrobin_apps_groups_group"
LEFT OUTER JOIN "astrobin_apps_groups_group_members" ON (
"astrobin_apps_groups_group"."id" = "astrobin_apps_groups_group_members"."group_id"
)
LEFT OUTER JOIN "astrobin_apps_groups_group_images" ON (
"astrobin_apps_groups_group"."id" = "astrobin_apps_groups_group_images"."group_id")
WHERE "astrobin_apps_groups_group"."public" = true
GROUP BY
"astrobin_apps_groups_group"."id",
"astrobin_apps_groups_group"."date_created",
"astrobin_apps_groups_group"."date_updated",
"astrobin_apps_groups_group"."creator_id",
"astrobin_apps_groups_group"."owner_id",
"astrobin_apps_groups_group"."name",
"astrobin_apps_groups_group"."description",
"astrobin_apps_groups_group"."category",
"astrobin_apps_groups_group"."public",
"astrobin_apps_groups_group"."moderated",
"astrobin_apps_groups_group"."autosubmission",
"astrobin_apps_groups_group"."forum_id"
ORDER BY "astrobin_apps_groups_group"."date_updated" ASC
Time
30455.9268951 ms
QUERY PLAN
GroupAggregate (cost=5910.49..8288.54 rows=216 width=242) (actual time=29255.329..30269.284 rows=27 loops=1)
-> Sort (cost=5910.49..6068.88 rows=63357 width=242) (actual time=29253.278..29788.601 rows=201888 loops=1)
Sort Key: astrobin_apps_groups_group.date_updated, astrobin_apps_groups_group.id, astrobin_apps_groups_group.date_created, astrobin_apps_groups_group.creator_id, astrobin_apps_groups_group.owner_id, astrobin_apps_groups_group.name, astrobin_apps_groups_group.description, astrobin_apps_groups_group.category, astrobin_apps_groups_group.public, astrobin_apps_groups_group.moderated, astrobin_apps_groups_group.autosubmission, astrobin_apps_groups_group.forum_id
Sort Method: external merge Disk: 70176kB
-> Hash Right Join (cost=15.69..857.39 rows=63357 width=242) (actual time=1.903..397.613 rows=201888 loops=1)
Hash Cond: (astrobin_apps_groups_group_images.group_id = astrobin_apps_groups_group.id)
-> Seq Scan on astrobin_apps_groups_group_images (cost=0.00..106.05 rows=6805 width=8) (actual time=0.024..12.510 rows=6837 loops=1)
-> Hash (cost=12.31..12.31 rows=270 width=238) (actual time=1.853..1.853 rows=323 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 85kB
-> Hash Right Join (cost=3.63..12.31 rows=270 width=238) (actual time=0.133..1.252 rows=323 loops=1)
Hash Cond: (astrobin_apps_groups_group_members.group_id = astrobin_apps_groups_group.id)
-> Seq Scan on astrobin_apps_groups_group_members (cost=0.00..4.90 rows=290 width=8) (actual time=0.004..0.348 rows=333 loops=1)
-> Hash (cost=3.29..3.29 rows=27 width=234) (actual time=0.103..0.103 rows=27 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 7kB
-> Seq Scan on astrobin_apps_groups_group (cost=0.00..3.29 rows=27 width=234) (actual time=0.004..0.049 rows=27 loops=1)
Filter: public
Total runtime: 30300.606 ms
如果有人可以建议一种优化方法,那就太好了。我觉得我错过了一个非常容易实现的目标。
谢谢!
最佳答案
- astrobin_apps_groups_group 和“astrobin_apps_groups_group_member, astrobin_apps_groups_group_image 表”中存在哪些索引?
- 您的选择中是否使用了诸如 SUM、COUNT 之类的聚合函数?如果不是,那么您可以从 GROUP BY 中删除所有列
- 计划显示大部分时间用于排序。如果您在 date_updated 上创建一个索引,该索引以 NULLS LAST 归档,并且索引中的最新值排在第一位,那么规划器可能会使用该索引进行排序。
- 对于排序,磁盘正在被使用,这是最昂贵的事情。这是因为您为排序而收集的数据不适合内存。尝试增加
WORK_MEM
- set work_mem='10MB';选择.....
关于sql - 如何优化 Django 生成的这个非常慢的查询?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39544228/