MySQL query optimization for a browser tracker

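I have been reading a lot of great answers to different problems on this site, but this is my first post. So thanks in advance for your help.

Here is my question:

I have a MySQL table that tracks visits to the different websites we own. This is the table structure: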

    create table navigation_base (
          uid int(11) NOT NULL,
          date datetime not null,
          dia date not null,
          ip int(4) unsigned not null default 0,
          session_id int unsigned not null,
          cliente smallint unsigned not null default 0,
          campaign mediumint unsigned not null default 0,
          trackcookie int unsigned not null,
          adgroup int unsigned not null default 0,
          PRIMARY KEY (uid)
     ) ENGINE=MyISAM;

This table has approx. 70 million rows (about 110,000 rows per day on average).

On that table we created indexes with the following commands:

    alter table navigation_base add index dia_cliente_campaign_ip (dia,cliente,campaign,ip);
    alter table navigation_base add index dia_cliente_campaign_ip_session (dia,cliente,campaign,ip,session_id);
    alter table navigation_base add index dia_cliente_campaign_ip_session_trackcookie (dia,cliente,campaign,ip,session_id,trackcookie);

We then use this table to get visitor statistics grouped by client, day, and campaign with the following query:

    select
      dia,
      navigation_base.campaign,
      navigation_base.cliente,
      count(distinct ip) as visitas,
      count(ip) as paginas_vistas,
      count(distinct session_id) as sesiones,
      count(distinct trackcookie) as cookies
    from navigation_base
    where dia between '2017-01-01' and '2017-01-31'
    group by dia, cliente, campaign
    order by NULL;

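Even with these indexes in place, response times for a one-month range are relatively slow: about 3 seconds on our server.

Is there any way to speed up these queries?

Thanks in advance.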

Best answer

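With this much data, indexing alone may not help much, because many rows share the same values. On top of that, the query combines GROUP BY, sorting, and aggregation, which together make it hard to optimize. Partitioning is the way forward, because: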

Some queries can be greatly optimized in virtue of the fact that data satisfying a given WHERE clause can be stored only on one or more partitions, which automatically excludes any remaining partitions from the search. Because partitions can be altered after a partitioned table has been created, you can reorganize your data to enhance frequent queries that may not have been often used when the partitioning scheme was first set up.

If that does not work for you, you can still select partitions explicitly:

In addition, MySQL 5.7 supports explicit partition selection for queries. For example, SELECT * FROM t PARTITION (p0,p1) WHERE c < 5 selects only those rows in partitions p0 and p1 that match the WHERE condition.
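
Applied to the query from the question, explicit partition selection might look like the sketch below. It assumes `navigation_base` has already been partitioned by RANGE on `dia`, and that the partition holding January 2017 is named `p_2017`. That name is a placeholder, not something from the question; the first statement shows how the real names could be looked up.

```sql
-- List the partitions of the table with their upper bounds and row estimates.
SELECT partition_name, partition_description, table_rows
FROM information_schema.partitions
WHERE table_schema = DATABASE()
  AND table_name = 'navigation_base';

-- Read only the named partition. p_2017 is a hypothetical partition name;
-- substitute the one reported by the query above.
SELECT dia,
       campaign,
       cliente,
       COUNT(DISTINCT ip)          AS visitas,
       COUNT(ip)                   AS paginas_vistas,
       COUNT(DISTINCT session_id)  AS sesiones,
       COUNT(DISTINCT trackcookie) AS cookies
FROM navigation_base PARTITION (p_2017)
WHERE dia BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY dia, cliente, campaign
ORDER BY NULL;
```

Rows outside the listed partitions are never returned, so a date range that crosses a partition boundary needs every affected partition named in the `PARTITION (...)` list.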

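    -- Note: MySQL requires the partitioning column to be part of every unique key,
    -- so PRIMARY KEY (uid) would first have to become e.g. PRIMARY KEY (uid, dia).
    ALTER TABLE navigation_base
        PARTITION BY RANGE (TO_DAYS(dia)) (
            -- RANGE partitions must be listed in strictly increasing order.
            PARTITION p0 VALUES LESS THAN (TO_DAYS('2015-12-31')),
            PARTITION p1 VALUES LESS THAN (TO_DAYS('2016-12-31')),
            PARTITION p2 VALUES LESS THAN (TO_DAYS('2017-12-31')),
            PARTITION p3 VALUES LESS THAN (TO_DAYS('2018-12-31')),
            ..
            PARTITION p10 VALUES LESS THAN MAXVALUE);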

Use larger or smaller partitions as needed.
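
One way to check whether partitioning actually helps is to look at which partitions the optimizer plans to read. The sketch below assumes the table has been partitioned by `RANGE (TO_DAYS(dia))` as proposed in this answer. In MySQL 5.7, plain `EXPLAIN` reports a `partitions` column; older versions need `EXPLAIN PARTITIONS` instead.

```sql
-- If pruning works, the "partitions" column of the output should list only
-- the partitions that can contain January 2017 rows, not the whole table.
EXPLAIN
SELECT dia,
       cliente,
       campaign,
       COUNT(DISTINCT ip) AS visitas
FROM navigation_base
WHERE dia BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY dia, cliente, campaign
ORDER BY NULL;
```

If every partition shows up in that column, the WHERE clause is not being pruned and the partitioning scheme brings no benefit for this query.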

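The most important thing to keep in mind is that MySQL generally uses only one index per table in a given query (index merge optimizations aside), so choose your indexes wisely.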
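
As an illustration of what choosing wisely could mean here: the three indexes from the question share the same leading columns, so the two shorter ones are left prefixes of the longest, which already contains every column the query reads. The sketch below assumes no other workload depends on the shorter indexes, which would need to be verified before dropping anything.

```sql
-- Inspect the existing indexes and their column order.
SHOW INDEX FROM navigation_base;

-- Assumption: nothing else relies on the two shorter indexes. They are left
-- prefixes of dia_cliente_campaign_ip_session_trackcookie, so that index can
-- serve the same lookups on (dia, cliente, campaign, ip[, session_id]).
ALTER TABLE navigation_base
    DROP INDEX dia_cliente_campaign_ip,
    DROP INDEX dia_cliente_campaign_ip_session;
```

Dropping the redundant indexes is not expected to make the SELECT itself faster; it mainly reduces storage and the cost of each insert.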

Source: a similar question about MySQL query optimization for a browser tracker on Stack Overflow: https://stackoverflow.com/questions/42486376/
