sql - How can I optimize this very slow query generated by Django?

Tags: sql django postgresql query-optimization

This is my Django ORM query:

from django.db.models import Count

# `sort` holds the ordering field name, e.g. 'date_updated'
Group.objects.filter(public=True)\
    .annotate(num_members=Count('members', distinct=True))\
    .annotate(num_images=Count('images', distinct=True))\
    .order_by(sort)

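Unfortunately, even with only a few dozen Groups, this takes more than 30 seconds. Removing the annotate statements makes the query dramatically faster: it then takes only 3 ms...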

My database backend is Postgres. Here are the SQL and the EXPLAIN output:

Executed SQL
SELECT ••• FROM "astrobin_apps_groups_group"
LEFT OUTER JOIN "astrobin_apps_groups_group_members" ON (
    "astrobin_apps_groups_group"."id" = "astrobin_apps_groups_group_members"."group_id"
)
LEFT OUTER JOIN "astrobin_apps_groups_group_images" ON (
    "astrobin_apps_groups_group"."id" = "astrobin_apps_groups_group_images"."group_id"
)
WHERE "astrobin_apps_groups_group"."public" = true
GROUP BY
    "astrobin_apps_groups_group"."id", 
    "astrobin_apps_groups_group"."date_created", 
    "astrobin_apps_groups_group"."date_updated", 
    "astrobin_apps_groups_group"."creator_id", 
    "astrobin_apps_groups_group"."owner_id", 
    "astrobin_apps_groups_group"."name", 
    "astrobin_apps_groups_group"."description", 
    "astrobin_apps_groups_group"."category", 
    "astrobin_apps_groups_group"."public", 
    "astrobin_apps_groups_group"."moderated", 
    "astrobin_apps_groups_group"."autosubmission", 
    "astrobin_apps_groups_group"."forum_id"
ORDER BY "astrobin_apps_groups_group"."date_updated" ASC

Time
30455.9268951 ms


QUERY PLAN
GroupAggregate  (cost=5910.49..8288.54 rows=216 width=242) (actual time=29255.329..30269.284 rows=27 loops=1)
  ->  Sort  (cost=5910.49..6068.88 rows=63357 width=242) (actual time=29253.278..29788.601 rows=201888 loops=1)
        Sort Key: astrobin_apps_groups_group.date_updated, astrobin_apps_groups_group.id, astrobin_apps_groups_group.date_created, astrobin_apps_groups_group.creator_id, astrobin_apps_groups_group.owner_id, astrobin_apps_groups_group.name, astrobin_apps_groups_group.description, astrobin_apps_groups_group.category, astrobin_apps_groups_group.public, astrobin_apps_groups_group.moderated, astrobin_apps_groups_group.autosubmission, astrobin_apps_groups_group.forum_id
        Sort Method: external merge  Disk: 70176kB
        ->  Hash Right Join  (cost=15.69..857.39 rows=63357 width=242) (actual time=1.903..397.613 rows=201888 loops=1)
              Hash Cond: (astrobin_apps_groups_group_images.group_id = astrobin_apps_groups_group.id)
              ->  Seq Scan on astrobin_apps_groups_group_images  (cost=0.00..106.05 rows=6805 width=8) (actual time=0.024..12.510 rows=6837 loops=1)
              ->  Hash  (cost=12.31..12.31 rows=270 width=238) (actual time=1.853..1.853 rows=323 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 85kB
                    ->  Hash Right Join  (cost=3.63..12.31 rows=270 width=238) (actual time=0.133..1.252 rows=323 loops=1)
                          Hash Cond: (astrobin_apps_groups_group_members.group_id = astrobin_apps_groups_group.id)
                          ->  Seq Scan on astrobin_apps_groups_group_members  (cost=0.00..4.90 rows=290 width=8) (actual time=0.004..0.348 rows=333 loops=1)
                          ->  Hash  (cost=3.29..3.29 rows=27 width=234) (actual time=0.103..0.103 rows=27 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 7kB
                                ->  Seq Scan on astrobin_apps_groups_group  (cost=0.00..3.29 rows=27 width=234) (actual time=0.004..0.049 rows=27 loops=1)
                                      Filter: public
Total runtime: 30300.606 ms
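
Reading the plan: the join against the members table produces 323 rows, the join against the images table raises that to 201888 rows, and the Sort over those rows spills 70176kB to disk. That pattern is what you would expect when two one-to-many joins are combined, because every member row gets paired with every image row of the same group. The following is a minimal sketch of that effect using Python's built-in sqlite3 module rather than Postgres. The table names (grp, grp_members, grp_images) and the data are made up for the demonstration. The last query shows one commonly used alternative, counting each related table in its own subquery; it is offered as an assumption to test, not as something this thread confirms.

```python
import sqlite3

# Hypothetical miniature schema; names and data are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grp (id INTEGER PRIMARY KEY);
    CREATE TABLE grp_members (id INTEGER PRIMARY KEY, group_id INTEGER, user_id INTEGER);
    CREATE TABLE grp_images (id INTEGER PRIMARY KEY, group_id INTEGER, image_id INTEGER);
""")
conn.execute("INSERT INTO grp (id) VALUES (1)")
conn.executemany(
    "INSERT INTO grp_members (group_id, user_id) VALUES (1, ?)",
    [(u,) for u in range(10)],
)
conn.executemany(
    "INSERT INTO grp_images (group_id, image_id) VALUES (1, ?)",
    [(i,) for i in range(200)],
)

# Joining both child tables at once pairs every member with every image.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM grp g
    LEFT OUTER JOIN grp_members m ON g.id = m.group_id
    LEFT OUTER JOIN grp_images i ON g.id = i.group_id
""").fetchone()[0]

# COUNT(DISTINCT ...) still returns the right numbers, but only after
# the database has produced every member/image pair.
double_join = conn.execute("""
    SELECT g.id, COUNT(DISTINCT m.id), COUNT(DISTINCT i.id)
    FROM grp g
    LEFT OUTER JOIN grp_members m ON g.id = m.group_id
    LEFT OUTER JOIN grp_images i ON g.id = i.group_id
    GROUP BY g.id
""").fetchall()

# Alternative: count each child table separately, so no pairs are built.
subqueries = conn.execute("""
    SELECT g.id,
           (SELECT COUNT(*) FROM grp_members m WHERE m.group_id = g.id),
           (SELECT COUNT(*) FROM grp_images i WHERE i.group_id = g.id)
    FROM grp g
""").fetchall()

print(joined_rows)   # 10 members x 200 images = 2000 joined rows
print(double_join)   # [(1, 10, 200)]
print(subqueries)    # [(1, 10, 200)]
```

Both queries return the same counts; the difference is only in how many intermediate rows the database has to build and sort.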

If anyone could suggest a way to optimize this, that would be great. I feel like I'm missing some very low-hanging fruit.

Thanks!

Best answer

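  1. Which indexes exist on the astrobin_apps_groups_group, astrobin_apps_groups_group_members and astrobin_apps_groups_group_images tables?
  2. Does your SELECT use aggregate functions such as SUM or COUNT? If not, you can drop all of the columns from the GROUP BY.
  3. The plan shows that most of the time is spent sorting. If you create an index on date_updated that is ordered with NULLS LAST and with the most recent values first, the planner may be able to use that index for the sort.
  4. The sort is spilling to disk, which is the most expensive part. That happens because the data gathered for the sort does not fit in memory. Try increasing work_mem, e.g. set work_mem='10MB'; select .....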
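
Items 1, 3 and 4 can be sketched as follows. This is only a sketch under stated assumptions: the index name groups_group_date_updated_idx and the helper names are hypothetical, `cursor` stands for any DB-API cursor connected to the Postgres database (for example one obtained from Django's `connection.cursor()`), and '10MB' is simply the value from the answer. The plan in the question reports a 70176kB on-disk sort, so a larger value may be needed before the sort stays in memory.

```python
import re

# Item 1: list the existing indexes via PostgreSQL's pg_indexes view.
LIST_INDEXES_SQL = """
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN (
    'astrobin_apps_groups_group',
    'astrobin_apps_groups_group_members',
    'astrobin_apps_groups_group_images'
)
"""

# Item 3: newest values first, NULLs last. The index name is hypothetical.
CREATE_INDEX_SQL = (
    "CREATE INDEX groups_group_date_updated_idx "
    "ON astrobin_apps_groups_group (date_updated DESC NULLS LAST)"
)


def work_mem_statement(value):
    """Build a session-level SET statement for work_mem, e.g. '10MB'."""
    # Accept only a positive number plus a memory unit, so the value
    # can be embedded in the SQL text safely.
    if not re.fullmatch(r"[1-9]\d*(kB|MB|GB)", value):
        raise ValueError(f"unexpected work_mem value: {value!r}")
    return f"SET work_mem = '{value}'"


def run_with_work_mem(cursor, query, value="10MB"):
    """Item 4: raise work_mem for this session, then run the query."""
    cursor.execute(work_mem_statement(value))
    cursor.execute(query)
    return cursor.fetchall()
```

Whether the planner actually uses such an index depends on the ORDER BY matching the index ordering. In the posted plan the sort key covers all twelve GROUP BY columns, so the index on its own may not remove the sort; comparing EXPLAIN ANALYZE output before and after each change is the way to find out.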

Regarding "sql - How can I optimize this very slow query generated by Django?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39544228/
