mysql - Selecting all rows (700,000) takes a very long time - hours

I am using MySQL/MariaDB (server version: 10.3.20-MariaDB-1:10.3.20+maria~stretch mariadb.org binary distribution).

I have about 700,000 records with these columns:

id
html (MEDIUMTEXT), with a very large average field length: ~150000
date
+2 other small columns

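The html column holds very long text (it is HTML). I now need to run select * from table; to analyze this HTML, but the query takes about 0.03819 seconds per row (I measured this on a smaller subset), so for all rows: 700000 * 0.03819 s = (700000 * 0.03819 s)/60/60 = more than 7 hours of selecting!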
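The estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the per-row time and average length are the figures quoted in the question, and treating the average length as bytes is an assumption.

```python
# Back-of-the-envelope estimate using the figures quoted in the question.
ROWS = 700_000
SECONDS_PER_ROW = 0.03819   # measured on a smaller subset
AVG_HTML_BYTES = 150_000    # approximate average html length (assumed to be bytes)

total_seconds = ROWS * SECONDS_PER_ROW
total_hours = total_seconds / 3600
total_gb = ROWS * AVG_HTML_BYTES / 1e9

print(f"{total_seconds:.0f} s = {total_hours:.2f} h")
print(f"about {total_gb:.0f} GB of html to transfer")
```

Under these assumptions the full scan comes to roughly 7.4 hours and on the order of 100 GB of data.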

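I have 8 cores and 60 GB of RAM. Profiling the query shows that the time spent sending data is very, very long. How can I speed this up? Is it possible, or is this much data too much for MySQL so that I need MongoDB?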

query_cache_limit = 64M 
query_cache_size = 1024M 
max_allowed_packet = 64M
net_buffer_length = 16384
max_connect_errors = 1000
thread_concurrency = 32
concurrent_insert = 2
read_rnd_buffer_size = 8M
bulk_insert_buffer_size = 8M
query_cache_limit = 64M
query_cache_size = 1024M
query_cache_type = 1
query_prealloc_size = 262144
query_alloc_block_size = 65536
transaction_alloc_block_size = 8192
transaction_prealloc_size = 4096
max_write_lock_count = 16
innodb_buffer_pool_size=30G
innodb_flush_log_at_trx_commit=2
innodb_thread_concurrency=16
innodb_flush_method=O_DIRECT
innodb_read_io_threads = 64
innodb_write_io_threads = 16
innodb_buffer_pool_instances = 20

MariaDB [db]> explain select id, href, html from raw limit 10;
+------+-------------+-------+------+---------------+------+---------+------+--------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra |
+------+-------------+-------+------+---------------+------+---------+------+--------+-------+
|    1 | SIMPLE      | raw   | ALL  | NULL          | NULL | NULL    | NULL | 658793 |       |
+------+-------------+-------+------+---------------+------+---------+------+--------+-------+
1 row in set (0.227 sec)

The indexes on the table:

MariaDB [db]> show index from raw;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| raw   |          0 | PRIMARY  |            1 | id          | A         |      658793 |     NULL | NULL   |      | BTREE      |         |               |
| raw   |          1 | id       |            1 | id          | A         |      658793 |     NULL | NULL   |      | BTREE      |         |               |
| raw   |          1 | href     |            1 | href        | A         |      658793 |     NULL | NULL   | YES  | BTREE      |         |               |
| raw   |          1 | date     |            1 | date        | A         |      131758 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (3.724 sec)

Best answer

38 ms to fetch 150 KB of data from a spinning disk is reasonably fast.

query_cache_size = 1024M -- This is too high. Stop at about 50M.

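A PRIMARY KEY is a unique index. So if id is the PRIMARY KEY, do not also declare KEY(id). The SHOW INDEX output in the question lists both PRIMARY and a separate id index on the same column; the second one is redundant.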
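As an illustration of that check, here is a minimal sketch that flags secondary indexes whose columns are a leading prefix of the primary key. The rows are copied from the SHOW INDEX output in the question; the helper name `redundant_with_primary` is hypothetical, and the prefix rule is a simplification rather than a complete redundancy test.

```python
from collections import defaultdict

# (Key_name, Seq_in_index, Column_name) taken from the SHOW INDEX output.
SHOW_INDEX_ROWS = [
    ("PRIMARY", 1, "id"),
    ("id", 1, "id"),
    ("href", 1, "href"),
    ("date", 1, "date"),
]

def redundant_with_primary(rows):
    """Return names of secondary indexes whose columns are a leading prefix of PRIMARY."""
    columns = defaultdict(list)
    # Sort by index name, then by position within the index.
    for key_name, _seq, column in sorted(rows, key=lambda r: (r[0], r[1])):
        columns[key_name].append(column)
    primary = columns.get("PRIMARY", [])
    return [
        name for name, cols in columns.items()
        if name != "PRIMARY" and cols == primary[:len(cols)]
    ]

print(redundant_with_primary(SHOW_INDEX_ROWS))
```

For this table the only index flagged is id, which could then be removed with `ALTER TABLE raw DROP INDEX id;`.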

"Is it possible, or is this much data too much for MySQL so that I need MongoDB?"

Assuming you are running at disk speed, you cannot expect any other product to run faster.

What will the client do with a 100 GB batch of data? MySQL will happily deliver it, but the client may choke.
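One way to keep the client from choking is to read the table in primary-key order in bounded chunks and process each chunk before fetching the next. The sketch below is not from the original answer. It uses an in-memory SQLite table as a stand-in so that it runs anywhere; the function name `iter_rows_in_chunks` and the sample data are hypothetical. Against MariaDB the same query shape applies, though Python MySQL drivers commonly use `%s` placeholders rather than `?`. This does not make the disk any faster; it only bounds how much data the client holds at once.

```python
import sqlite3

def iter_rows_in_chunks(conn, chunk_size=1000):
    """Yield (id, html) rows in primary-key order, fetching chunk_size rows per query."""
    last_id = 0  # assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, html FROM raw WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        last_id = rows[-1][0]  # resume after the last id seen

# Stand-in data: an in-memory SQLite table instead of the real MariaDB table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw (id INTEGER PRIMARY KEY, html TEXT)")
conn.executemany(
    "INSERT INTO raw (id, html) VALUES (?, ?)",
    [(i, f"<p>page {i}</p>") for i in range(1, 26)],
)

total_chars = 0
row_count = 0
for row_id, html in iter_rows_in_chunks(conn, chunk_size=10):
    total_chars += len(html)  # replace with the real HTML analysis
    row_count += 1

print(row_count, total_chars)
```

Because each query seeks on the primary key, every chunk starts where the previous one ended instead of rescanning skipped rows, as a growing OFFSET would.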

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/59221986/
