mysql - Low-cardinality index still slows down query

I'm using MySQL 5.5 with InnoDB. I have a query like this:

    SELECT
        count(distinct a.thing_id) as new_thing_count,
        sum(b.price) as new_thing_spend
    FROM thing ii
    LEFT OUTER JOIN thing a
        ON a.customer_id = ii.customer_id
        AND a.created_at >= '2013-01-01'
        AND a.created_at <= '2013-03-31'
    JOIN whatsit b
        ON b.whatsit_id = a.original_whatsit_id
    WHERE ii.customer_id = 3

where

thing has about 25k rows, of which about 3.5k belong to customer 3
there are 12 possible customer_id values

Now, when I run this query with an index on customer_id, it takes about 10 seconds. When I drop the index, it takes 0.03 seconds.

I can't figure out why this happens. Here is the EXPLAIN output without the index:

    id  select_type  table  type    possible_keys   key             key_len  ref                       rows   Extra
    1   SIMPLE       ii     ALL                                                                        24937  Using where
    1   SIMPLE       a      ALL                                                                        24937  Using where; Using join buffer
    1   SIMPLE       b      eq_ref  PRIMARY         PRIMARY         4        db.a.original_whatsit_id  1

And here it is with the index (thing_customer):

    id  select_type  table  type    possible_keys   key             key_len  ref                       rows   Extra
    1   SIMPLE       ii     ref     thing_customer  thing_customer  4        const                     3409   Using index
    1   SIMPLE       a      ref     thing_customer  thing_customer  4        const                     3409   Using where
    1   SIMPLE       b      eq_ref  PRIMARY         PRIMARY         4        db.a.original_whatsit_id  1

Can someone help me understand why this index slows things down when, logically, it seems it shouldn't?

Best answer

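When your database engine decides to use the index, it fetches the matching rows one by one in index order. That can mean reading one row from disk page 2, another from page 4, another from page 1, the next from page 2 again, and so on.

Sometimes all that going back and forth means the index doesn't help; quite the opposite.

If the database engine doesn't do a good job of collecting and analyzing table statistics when it builds the query plan, it may fail to recognize that the index produces completely fragmented disk reads. That may be what you are experiencing.

Try analyzing the tables to gather fresh statistics: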

http://dev.mysql.com/doc/refman/5.5/en/analyze-table.html

Then try again with and without the index.
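
As a rough sketch of that suggestion, the statements below refresh the statistics and then compare the two plans without actually dropping the index. The table, column, and index names are taken from the question. Using an IGNORE INDEX hint instead of DROP INDEX is an assumption made here for convenience, not something the answer prescribes, and the plans you get may still differ from the ones shown in the question.

```sql
-- Refresh the key distribution statistics the optimizer relies on.
ANALYZE TABLE thing, whatsit;

-- Check the estimated cardinality recorded for each index on thing.
SHOW INDEX FROM thing;

-- Plan with thing_customer available to the optimizer (the default).
EXPLAIN
SELECT
    count(distinct a.thing_id) as new_thing_count,
    sum(b.price) as new_thing_spend
FROM thing ii
LEFT OUTER JOIN thing a
    ON a.customer_id = ii.customer_id
    AND a.created_at >= '2013-01-01'
    AND a.created_at <= '2013-03-31'
JOIN whatsit b
    ON b.whatsit_id = a.original_whatsit_id
WHERE ii.customer_id = 3;

-- Same query with thing_customer hidden from the optimizer on both
-- references to thing, so no DROP INDEX is needed for the comparison.
EXPLAIN
SELECT
    count(distinct a.thing_id) as new_thing_count,
    sum(b.price) as new_thing_spend
FROM thing ii IGNORE INDEX (thing_customer)
LEFT OUTER JOIN thing a IGNORE INDEX (thing_customer)
    ON a.customer_id = ii.customer_id
    AND a.created_at >= '2013-01-01'
    AND a.created_at <= '2013-03-31'
JOIN whatsit b
    ON b.whatsit_id = a.original_whatsit_id
WHERE ii.customer_id = 3;
```

Removing the EXPLAIN keyword from either statement runs the query itself, which gives a timing comparison while leaving the index in place.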

The original question, "mysql - Low-cardinality index still slows down query", can be found on Stack Overflow: https://stackoverflow.com/questions/16367706/
