mysql - 高效的 MySQL 查询查找 A 中与 B 不匹配的条目

标签 mysql join left-join outer-join ubercart

我有几个表(产品和供应商),想要找出供应商表中不再列出哪些项目。

表 uc_products 包含产品。表 uc_supplier_csv 包含供应商库存。 uc_products.model 加入针对 uc_suppliers.sku。

在尝试识别产品表中的库存(供应商表中未提及)时,我看到很长的查询。我只想提取匹配条目的nidsid IS NULL 只是为了让我可以识别哪些项目没有供应商。

对于下面的第一个查询,数据库服务器(4GB RAM/2x 2.4GHz intel)需要一个小时才能获得结果(507 行)。我没有等到第二个查询完成。

如何使该查询更加优化?是因为字符集不匹配吗?

我认为以下是最有效的 SQL 使用方式:

         SELECT nid, sid 
           FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
             ON p.model = c.sku
         WHERE sid IS NULL ;

对于此查询,我得到以下 EXPLAIN 结果:

mysql> EXPLAIN SELECT nid, sid FROM uc_products p LEFT OUTER JOIN uc_supplier_csv c ON p.model = c.sku WHERE sid IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra                   |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
|  1 | SIMPLE      | p     | ALL  | NULL          | NULL | NULL    | NULL |   6526 |                         | 
|  1 | SIMPLE      | c     | ALL  | NULL          | NULL | NULL    | NULL | 126639 | Using where; Not exists | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
2 rows in set (0.00 sec)

我本以为键 idx_sku 和 idx_model 在这里使用是有效的,但事实并非如此。这是因为表的默认字符集不匹配吗?一种是 UTF-8,一种是 latin1。

我也考虑过这种形式:

SELECT nid 
  FROM uc_products 
 WHERE model 
NOT IN ( 
         SELECT DISTINCT sku FROM uc_supplier_csv 
       ) ;

EXPLAIN 显示该查询的以下结果:

mysql> explain select nid from uc_products where model not in ( select sku from uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type        | table           | type  | possible_keys         | key     | key_len | ref  | rows   | Extra                    |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
|  1 | PRIMARY            | uc_products     | ALL   | NULL                  | NULL    | NULL    | NULL |   6520 | Using where              | 
|  2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258     | NULL | 126639 | Using where; Using index | 
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)

为了不错过任何事情,这里有一些更令人兴奋的细节:表格大小和统计数据,以及表格结构:)

mysql> show table status where Name in ( 'uc_supplier_csv', 'uc_products' ) ;
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name            | Engine | Version | Row_format | Rows   | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| uc_products     | MyISAM |      10 | Dynamic    |   6520 |             89 |      585796 | 281474976710655 |       232448 |       912 |           NULL | 2009-04-24 11:03:15 | 2009-10-12 14:23:43 | 2009-04-24 11:03:16 | utf8_general_ci   |     NULL |                |         | 
| uc_supplier_csv | MyISAM |      10 | Dynamic    | 126639 |             26 |     3399704 | 281474976710655 |      5864448 |         0 |           NULL | 2009-10-12 14:28:25 | 2009-10-12 14:28:25 | 2009-10-12 14:28:27 | latin1_swedish_ci |     NULL |                |         | 
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+

CREATE TABLE `uc_products` (
  `vid` mediumint(9) NOT NULL default '0',
  `nid` mediumint(9) NOT NULL default '0',
  `model` varchar(255) NOT NULL default '',
  `list_price` decimal(10,2) NOT NULL default '0.00',
  `cost` decimal(10,2) NOT NULL default '0.00',
  `sell_price` decimal(10,2) NOT NULL default '0.00',
  `weight` float NOT NULL default '0',
  `weight_units` varchar(255) NOT NULL default 'lb',
  `length` float unsigned NOT NULL default '0',
  `width` float unsigned NOT NULL default '0',
  `height` float unsigned NOT NULL default '0',
  `length_units` varchar(255) NOT NULL default 'in',
  `pkg_qty` smallint(5) unsigned NOT NULL default '1',
  `default_qty` smallint(5) unsigned NOT NULL default '1',
  `unique_hash` varchar(32) NOT NULL,
  `ordering` tinyint(2) NOT NULL default '0',
  `shippable` tinyint(2) NOT NULL default '1',
  PRIMARY KEY  (`vid`),
  KEY `idx_model` (`model`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 

CREATE TABLE `uc_supplier_csv` (
  `sid` int(10) unsigned NOT NULL default '0',
  `sku` varchar(255) default NULL,
  `stock` int(10) unsigned NOT NULL default '0',
  `list_price` decimal(8,2) default '0.00',
  KEY `idx_sku` (`sku`),
  KEY `idx_stock` (`stock`),
  KEY `idx_sku_stock` (`sku`,`stock`),
  KEY `idx_sid` (`sid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 

编辑:为下面马丁建议的几个查询添加查询计划:

mysql> explain SELECT nid FROM uc_products p WHERE NOT EXISTS ( SELECT 1 FROM uc_supplier_csv c WHERE p.model = c.sku ) ;
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra                    |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
|  1 | PRIMARY            | p     | ALL   | NULL          | NULL    | NULL    | NULL |   6526 | Using where              | 
|  2 | DEPENDENT SUBQUERY | c     | index | NULL          | idx_sku | 258     | NULL | 126639 | Using where; Using index | 
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)

mysql> explain SELECT nid FROM uc_products WHERE model NOT IN ( SELECT sku  FROM uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type        | table           | type  | possible_keys         | key     | key_len | ref  | rows   | Extra                    |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
|  1 | PRIMARY            | uc_products     | ALL   | NULL                  | NULL    | NULL    | NULL |   6526 | Using where              | 
|  2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258     | NULL | 126639 | Using where; Using index | 
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)

最佳答案

也许尝试使用 NOT EXISTS 而不是计数?例如:

SELECT nid 
  FROM uc_products p
 WHERE NOT EXISTS ( 
       SELECT 1 
         FROM uc_supplier_csv c
        WHERE p.model = c.sku
       )

SO 用户 Quassnoi 有 short article概述了一些测试,表明这也值得一试:

SELECT nid 
  FROM uc_products
 WHERE model NOT IN ( 
       SELECT sku 
       FROM uc_supplier_csv
       )

基本上按照您的原始查询,没有 DISTINCTion。

克里斯,再给你一份,这次有交叉编码连接的帮助:

SELECT nid
  FROM uc_products p
 WHERE NOT EXISTS (
       SELECT 1
       FROM uc_supplier_csv c
       WHERE CONVERT( p.model USING latin1 )  = c.sku
       )

关于mysql - 高效的 MySQL 查询查找 A 中与 B 不匹配的条目,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/1552585/

相关文章:

mysql - 如何通过复制磁盘文件*.frm和*.ibd来克隆InnoDB表?

mysql - 多次加入来收集用户评分?

database - Hash Join 需要全表扫描

mysql - Laravel 查询 where 子查询

mysql - 按具有 has_many 关联的连接表列分组

mongodb加入 meteor

sql - 选择所有不在 postgres 表中的整数

mysql - 优化Mysql查询Left Join

MySQL 左外连接验证?

php - mysqli_free_result() : Object of class mysqli_result could not be converted to string