How can I tell from Hive EXPLAIN output whether a full table scan is happening?
For example, is the whole table being read from disk? The table has 993 rows.
The query is:
explain select latitude,longitude FROM CRIMES WHERE geohash='dp3twhjuyutr'
I have a secondary index on the geohash column.
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: crimes
            filterExpr: (geohash = 'dp3twhjuyutr') (type: boolean)
            Statistics: Num rows: 993 Data size: 265582 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (geohash = 'dp3twhjuyutr') (type: boolean)
              Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: latitude (type: double), longitude (type: double)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
Best answer
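- If the plan contains no partition predicate, the table is scanned in full. Note that this is a separate matter from predicate pushdown in ORC.
- Check the data size and row count reported by each operator. In the plan above, the TableScan operator reports Num rows: 993, which is the entire table.
- The EXPLAIN DEPENDENCY command shows the full set of input_partitions, so you can check exactly what will be scanned.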
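As a minimal sketch of the last point, the query from the question can be prefixed with EXPLAIN DEPENDENCY instead of EXPLAIN. This assumes the crimes table is reachable from the current database; nothing else is changed.

```sql
-- Ask Hive which tables and partitions the query would read,
-- without running the query itself.
EXPLAIN DEPENDENCY
SELECT latitude, longitude
FROM crimes
WHERE geohash = 'dp3twhjuyutr';
```

The result is a JSON document with input_tables and input_partitions entries. If the table is not partitioned, which the plan in the question suggests but does not confirm, there is nothing for Hive to prune and input_partitions will not narrow the read, so the query reads the whole table.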
A similar question about where to see a full table scan in a Hive explain plan can be found on Stack Overflow: https://stackoverflow.com/questions/56294254/