Here is a query scenario; the scope is explained in the inline comments:
select
-- selecting both entity ids
entity_a.id as entity_a_id,
entity_b.id as entity_b_id,
concat(entity_a.id, entity_b.id) as `key`
from `entity_b`
-- Following are few one to many relations to match entity a with b
inner join `entity_b_function` on
`entity_b`.`id` = `entity_b_function`.`entity_b_id`
inner join `entity_b_category` on
`entity_b`.`id` = `entity_b_category`.`entity_b_id`
inner join `entity_b_scope` on
`entity_b`.`id` = `entity_b_scope`.`entity_b_id`
inner join `entity_a` on
`entity_a`.`category_id` = `entity_b_category`.`category_id` and
`entity_a`.`scope_id` = `entity_b_scope`.`scope_id`
inner join `entity_a_function` on
`entity_b_function`.`function_id` = `entity_a_function`.`function_id`
-- pivot of entity a and b
-- making sure matching entities are finally related in pivot
left join `entity_a_b_pivot` on
`entity_a_b_pivot`.`entity_a_id` = `entity_a`.`id` and
`entity_a_b_pivot`.`entity_b_id` = `entity_b`.`id`
where
-- we need only matching entities which are not yet related in pivot
`entity_a_b_pivot`.`id` is null and
-- when both entities are active in the system
`entity_b`.`status` = 1 and
`entity_a`.`status` = 1
LIMIT 5000;
The current result looks like this
(the rows marked with > are duplicates caused by the one-to-many joins):
  entity_a_id  entity_b_id  key
  1            1            11
> 1            1            11
  1            2            12
  2            1            21
  2            2            22
> 2            2            22
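If I use GROUP BY key or DISTINCT(key) here to eliminate the duplicates, the query hangs indefinitely at 100% CPU usage. Without them it returns 5K records in the blink of an eye, but about 90% of those records are duplicates.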
How can I optimize the query so that it returns distinct results?
Best answer
How about adding DISTINCT at the start of the select list?
select
-- selecting both entity ids
distinct
entity_a.id as entity_a_id,
entity_b.id as entity_b_id,
concat(entity_a.id, entity_b.id) as `key`
from `entity_b`
-- Following are few one to many relations to match entity a with b
inner join `entity_b_function` on
`entity_b`.`id` = `entity_b_function`.`entity_b_id`
inner join `entity_b_category` on
`entity_b`.`id` = `entity_b_category`.`entity_b_id`
inner join `entity_b_scope` on
`entity_b`.`id` = `entity_b_scope`.`entity_b_id`
inner join `entity_a` on
`entity_a`.`category_id` = `entity_b_category`.`category_id` and
`entity_a`.`scope_id` = `entity_b_scope`.`scope_id`
inner join `entity_a_function` on
`entity_b_function`.`function_id` = `entity_a_function`.`function_id`
-- pivot of entity a and b
-- making sure matching entities are finally related in pivot
left join `entity_a_b_pivot` on
`entity_a_b_pivot`.`entity_a_id` = `entity_a`.`id` and
`entity_a_b_pivot`.`entity_b_id` = `entity_b`.`id`
where
-- we need only matching entities which are not yet related in pivot
`entity_a_b_pivot`.`id` is null and
-- when both entities are active in the system
`entity_b`.`status` = 1 and
`entity_a`.`status` = 1
LIMIT 5000;
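
If DISTINCT over the full join is still slow, another option is to avoid producing the duplicates in the first place. The duplicates come from joining the one-to-many tables, so the sketch below moves each of those tables into an EXISTS subquery, which only checks that a matching row exists and never multiplies the outer rows. It assumes that `entity_a`.`id` and `entity_b`.`id` are primary keys and that `entity_a_b_pivot`.`id` is never null; the aliases a, b, bc, bs, bf, af and p are introduced here for readability. Treat it as a starting point to compare with EXPLAIN, not as a guaranteed speed-up.

```sql
select
    a.`id` as entity_a_id,
    b.`id` as entity_b_id,
    concat(a.`id`, b.`id`) as `key`
from `entity_b` as b
-- every active a is paired with every active b; the EXISTS checks below filter the pairs
inner join `entity_a` as a
    on a.`status` = 1
where
    b.`status` = 1
    -- a and b share at least one category
    and exists (
        select 1
        from `entity_b_category` as bc
        where bc.`entity_b_id` = b.`id`
          and bc.`category_id` = a.`category_id`
    )
    -- a and b share at least one scope
    and exists (
        select 1
        from `entity_b_scope` as bs
        where bs.`entity_b_id` = b.`id`
          and bs.`scope_id` = a.`scope_id`
    )
    -- same condition as the original query: some function of b also appears
    -- in entity_a_function. The original never ties entity_a_function to a
    -- specific entity_a row, so neither does this sketch.
    and exists (
        select 1
        from `entity_b_function` as bf
        inner join `entity_a_function` as af
            on af.`function_id` = bf.`function_id`
        where bf.`entity_b_id` = b.`id`
    )
    -- the pair is not yet related in the pivot table
    and not exists (
        select 1
        from `entity_a_b_pivot` as p
        where p.`entity_a_id` = a.`id`
          and p.`entity_b_id` = b.`id`
    )
limit 5000;
```

Because each (a, b) pair can appear only once here, no DISTINCT or GROUP BY is needed. How well this performs depends on the indexes available. Composite indexes covering the columns used in each subquery, for example (`entity_b_id`, `category_id`) on `entity_b_category`, would probably help, but whether such indexes exist is not stated in the question. Note also that the join to `entity_a_function` in the original query has no condition linking it to `entity_a`, which may itself account for many of the duplicate rows. If that table has a column referencing `entity_a`, adding that condition is worth checking.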
Regarding mysql - removing duplicates using GROUP BY or DISTINCT, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36236922/