indexing - MonetDB 带索引的慢查询

标签 indexing monetdb

我创建了一个10GB(60M记录)的表,插入数据后手动添加了唯一索引(hidden_​​id)。 我有最简单的查询,但花了一分钟才能完成。

select hidden_id from netflow where hidden_id = 350000;

而且查询也花费了数十分钟“select * from netflow order by hide_id limit 12500 offset 212500;”。 这真的让我很困惑。

我在下面发布了第一个查询的分析。知道为什么这么慢吗?

trace select hidden_id from netflow where hidden_id = 350000;

+----------+------------------------------------------------------------------+
| ticks    | stmt                                                             |
+==========+==================================================================+
|        3 | X_3 := sql.mvc();                                                |
|       15 | X_7=<tmp_2510>[69396995] := sql.bind(X_3=0,"sys","netflow","hidd |
:          : en_id",0);                                                       :
|      227 | X_4:bat[:oid,:oid] =<tmp_13332>[69396995] := sql.tid(X_3=0,"sys" |
:          : ,"netflow");                                                     :
| 72978741 | X_36=<tmp_4053>[1] := algebra.subselect(X_7=<tmp_2510>[69396995] |
:          : ,X_4=<tmp_13332>:bat[:oid,:oid][69396995],A0=350000:lng,A0=35000 :
:          : 0:lng,true,true,false);                                          :
|       17 | (X_10=<tmp_2175>[0],r1_10=<tmp_3416>[0]) := sql.bind(X_3=0,"sys" |
:          : ,"netflow","hidden_id",2);                                       :
|       14 | X_37=<tmp_13337>[0] := algebra.subselect(r1_10=<tmp_3416>[0],A0= |
:          : 350000:lng,A0=350000:lng,true,true,false);                       :
|        6 | X_13=<tmp_3416>[0] := sql.bind(X_3=0,"sys","netflow","hidden_id" |
:          : ,1);                                                             :
|       15 | X_38=<tmp_11053>[0] := algebra.subselect(X_13=<tmp_3416>[0],X_4= |
:          : <tmp_13332>:bat[:oid,:oid][69396995],A0=350000:lng,A0=350000:lng :
:          : ,true,true,false);                                               :
|        4 | X_15=<tmp_4053>[1] := sql.subdelta(X_36=<tmp_4053>[1],X_4=<tmp_1 |
:          : 3332>:bat[:oid,:oid][69396995],X_10=<tmp_2175>[0],X_37=<tmp_1333 :
:          : 7>[0],X_38=<tmp_11053>[0]);                                      :
|       20 | X_17=<tmp_11053>[1] := sql.projectdelta(X_15=<tmp_4053>[1],X_7=< |
:          : tmp_2510>[69396995],X_10=<tmp_2175>[0],r1_10=<tmp_3416>[0],X_13= :
:          : <tmp_3416>[0]);                                                  :
|        6 | X_18 := sql.resultSet(1,1,X_17=<tmp_11053>[1]);                  |
|        7 | sql.rsColumn(X_18=1,"sys.netflow","hidden_id","bigint",64,0,X_17 |
:          : =<tmp_11053>[1]);                                                :
|        2 | X_23 := io.stdout();                                             |
|       25 | sql.exportResult(X_23=="104d2":streams,X_18=1);                  |
|        1 | end s1_3;                                                        |
| 73011629 | X_5:void  := user.s1_3(350000:lng);                              |
+----------+------------------------------------------------------------------+

这是正在创建的表。

CREATE TABLE "netflow" (
    "time_seconds" double DEFAULT NULL,
    "parsed_date" timestamp DEFAULT NULL,
    "date_time_str" varchar(45) DEFAULT NULL,
    "ip_layer_protocol" bigint DEFAULT NULL,
    "ip_layer_protocol_code" varchar(45) DEFAULT NULL,
    "first_seen_src_ip" varchar(45) DEFAULT NULL,
    "first_seen_dest_ip" varchar(45) DEFAULT NULL,
    "first_seen_src_port" bigint DEFAULT NULL,
    "first_seen_dest_port" bigint DEFAULT NULL,
    "more_fragments" varchar(45) DEFAULT NULL,
    "cont_fragments" varchar(45) DEFAULT NULL,
    "duration_seconds" bigint DEFAULT NULL,
    "first_seen_src_payload_bytes" bigint DEFAULT NULL,
    "first_seen_dest_payload_bytes" bigint DEFAULT NULL,
    "first_seen_src_total_bytes" bigint DEFAULT NULL,
    "first_seen_dest_total_bytes" bigint DEFAULT NULL,
    "first_seen_src_packet_count" bigint DEFAULT NULL,
    "first_seen_dest_packet_count" bigint DEFAULT NULL,
    "record_force_out" varchar(45) DEFAULT NULL
);

更新:

平台:Windows 7 无并行

MonetDB 版本:MonetDB 5 服务器 v11.15.19“2013 年 2 月-SP6”

存储中表的描述:

select * from storage() where "table"='netflow';
+--------+---------+-------------------------------+-----------+----------+-----
-----+-----------+------------+------------+---------+--------+
| schema | table   | column                        | type      | location | count    | typewidth | columnsize | heapsize   | indices | sorted |
+========+=========+===============================+===========+==========+==========+===========+============+============+=========+========+
| sys    | netflow | time_seconds                  | double    | 17\1711  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | parsed_date                   | timestamp | 20\2054  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | date_time_str                 | varchar   | 07\734   | 69396995 |        21 |  277587980 | 2684354560 |       0 | false  |
| sys    | netflow | ip_layer_protocol             | bigint    | 62\6261  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | ip_layer_protocol_code        | varchar   | 62\6213  | 69396995 |         3 |   69396995 |     524288 |       0 | false  |
| sys    | netflow | first_seen_src_ip             | varchar   | 63\6342  | 69396995 |        11 |  138793990 |     524288 |       0 | false  |
| sys    | netflow | first_seen_dest_ip            | varchar   | 23\2324  | 69396995 |         8 |  138793990 |     524288 |       0 | false  |
| sys    | netflow | first_seen_src_port           | bigint    | 15\1574  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_dest_port          | bigint    | 23\2370  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | more_fragments                | varchar   | 65\6521  | 69396995 |         1 |   69396995 |     524288 |       0 | false  |
| sys    | netflow | cont_fragments                | varchar   | 65\6524  | 69396995 |         1 |   69396995 |     524288 |       0 | false  |
| sys    | netflow | duration_seconds              | bigint    | 65\6560  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_src_payload_bytes  | bigint    | 65\6561  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_dest_payload_bytes | bigint    | 65\6562  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_src_total_bytes    | bigint    | 65\6563  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_dest_total_bytes   | bigint    | 65\6564  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_src_packet_count   | bigint    | 65\6565  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | first_seen_dest_packet_count  | bigint    | 65\6566  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | record_force_out              | varchar   | 65\6567  | 69396995 |         1 |   69396995 |     524288 |       0 | false  |
| sys    | netflow | hidden_id                     | bigint    | 25\2510  | 69396995 |         8 |  555175960 |          0 |       0 | false  |
| sys    | netflow | index_id                      | oid       | 73\7375  | 69396995 |         8 |  555175960 |          0 |       0 | true   |
+--------+---------+-------------------------------+-----------+----------+----------+-----------+------------+------------+---------+--------+

最佳答案

请注意,SQL 语句 CREATE INDEX 在 MonetDB 中会被默默忽略。 它将根据需要创建必要的哈希索引。 您的第一个查询会触发此操作。

我还注意到你似乎没有并行运行。 请务必注明所使用的平台和 MonetDB 版本。

关于indexing - MonetDB 带索引的慢查询,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/22430284/

相关文章:

mysql - 如何索引状态(标志)字段

python - numpy数组中非零值的随机索引

sql - 子查询中的 "ORDER BY"- 在 MonetDB 中不可用?

mysql - 我如何解释几个连续的查询而不执行它们?

mysql - 是否将图像存储在 SQL 中?

java - 在 Neo4j 索引中创建第 128 个节点后,无法访问更多节点

javascript - 如何找到长数组中每个元素的总和?

Mysql 在选择非索引字段时不使用索引