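I am migrating from ElasticSearch v1.0.0 to v7.13.1. I know that versions from ElasticSearch 7.0.0 onward have removed support for specifying types. In addition, some classes have changed; for example, TermsAggregationBuilder replaces TermsBuilder.
However, when I build the query with QueryBuilders and AggregationBuilder, I see some extra fields generated that I do not want.
Is there any way to avoid them programmatically?
Before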
private TermsBuilder createAggreationsUriDetails() {
return AggregationBuilders
.terms(xxxxxxxx)...
After
private TermsAggregationBuilder createAggreationsUriDetails() {
return AggregationBuilders
.terms(ElasticConstants.URI)...
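I also use matchQuery() to build the match query for the upgraded ES version, and I still see some extra fields. The same goes for the order.
Comparison of the queries from the old and new Elasticsearch versions
Before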
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"uri.raw": {
"query": "sample_uri",
"type": "boolean"
}
}
},
{
"range": {
"@timestamp": {
"from": 1655145000000,
"to": 1655231400000,
"include_lower": "true",
"include_upper": "false",
"format": "epoch_millis"
}
}
}
]
}
},
"aggs": {
"uri": {
"terms": {
"field": "uri.raw",
"size": 1,
"order": {
"_count": "desc"
}
},
"aggregations": {
"client_id": {
"terms": {
"field": "client_id",
"size": 10000,
"order": {
"_count": "desc"
}
},
"aggregations": {
"response_code": {
"terms": {
"field": "response_code.raw",
"size": 8,
"order": {
"_count": "desc"
}
},
"aggregations": {
"datetime": {
"date_histogram": {
"field": "@timestamp",
"interval": "1m",
"min_doc_count": 1
}
}
}
}
}
}
}
}
}
}
Query built with the new ES version's QueryBuilder
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"uri.raw": {
"query": "sample_url",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": "true",
"lenient": "false",
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": "true",
"boost": 1
}
}
},
{
"range": {
"@timestamp": {
"from": 1655145000000,
"to": 1655231400000,
"include_lower": "true",
"include_upper": "false",
"format": "epoch_millis",
"boost": 1
}
}
}
],
"adjust_pure_negative": "true",
"boost": 1
}
},
"aggs": {
"uri": {
"terms": {
"field": "uri.raw",
"size": 1,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": "false",
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"client_id": {
"terms": {
"field": "client_id",
"size": 10000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": "false",
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"response_code": {
"terms": {
"field": "response_code.raw",
"size": 8,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": "false",
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"datetime": {
"date_histogram": {
"field": "@timestamp",
"interval": "60000ms",
"offset": 0,
"order": {
"_key": "asc"
},
"keyed": "false",
"min_doc_count": 1
}
}
}
}
}
}
}
}
}
}
Best answer
The extra fields you see are actually parameters of the query. For example, the match query in 7.X looks like this:
"match": {
"uri.raw": {
"query": "sample_url",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": "true",
"lenient": "false",
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": "true",
"boost": 1
}
}
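operator, prefix_length, and lenient are all parameters of the match query. Even if you do not supply them, they are added with their default values: when you send the query as JSON without these parameters, Elasticsearch adds them on its side, so there is no need to worry about them. If you like, you can change some of these values to see the effect on the query results; for example, changing operator to AND will reduce the number of results for multi-term searches.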
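If a shorter request body is still wanted (for logging or for diffing against the old queries), one option is to post-process a Map representation of the query and drop entries that equal their defaults. The following is only a minimal sketch: MatchDefaults and stripDefaults are hypothetical names, not part of the Elasticsearch client, and the default values are simply the ones printed in the 7.x output above, written here as Java booleans and integers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Elasticsearch client.
final class MatchDefaults {
    // Defaults as they appear in the 7.x builder output quoted above (assumption).
    static final Map<String, Object> DEFAULTS = new LinkedHashMap<>();
    static {
        DEFAULTS.put("operator", "OR");
        DEFAULTS.put("prefix_length", 0);
        DEFAULTS.put("max_expansions", 50);
        DEFAULTS.put("fuzzy_transpositions", true);
        DEFAULTS.put("lenient", false);
        DEFAULTS.put("zero_terms_query", "NONE");
        DEFAULTS.put("auto_generate_synonyms_phrase_query", true);
        DEFAULTS.put("boost", 1);
    }

    // Returns a copy of params without the entries whose value equals the listed default.
    // Keys that are not in DEFAULTS (such as "query") are always kept.
    static Map<String, Object> stripDefaults(Map<String, Object> params) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : params.entrySet()) {
            Object def = DEFAULTS.get(e.getKey());
            if (def == null || !def.equals(e.getValue())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

Applied to the parameters of the match clause above, this would leave only the "query" entry, while a non-default value such as operator AND would be kept. Since the server fills in the same defaults anyway, this changes only how the request looks, not what it does.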
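Note: you can also look at the MatchQueryBuilder code in the Elasticsearch codebase to see that it uses the builder design pattern and how it passes default values for the parameters.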
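To illustrate the idea in the note, here is a rough sketch of a builder whose fields start at default values and are always written out on serialization. SketchMatchQuery is a made-up class for illustration only; it is not the real MatchQueryBuilder and covers just three of the parameters, with defaults assumed from the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in, NOT the real MatchQueryBuilder.
final class SketchMatchQuery {
    private final String field;
    private final Object value;
    // Fields start at their defaults, so they exist even if the caller never sets them.
    private String operator = "OR";
    private boolean lenient = false;
    private float boost = 1.0f;

    SketchMatchQuery(String field, Object value) {
        this.field = field;
        this.value = value;
    }

    // Fluent setters return this so calls can be chained.
    SketchMatchQuery operator(String operator) { this.operator = operator; return this; }
    SketchMatchQuery lenient(boolean lenient) { this.lenient = lenient; return this; }
    SketchMatchQuery boost(float boost) { this.boost = boost; return this; }

    // Serializes every parameter, whether or not the caller changed it.
    Map<String, Object> toMap() {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("query", value);
        params.put("operator", operator);
        params.put("lenient", lenient);
        params.put("boost", boost);
        Map<String, Object> byField = new LinkedHashMap<>();
        byField.put(field, params);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("match", byField);
        return root;
    }
}
```

Calling new SketchMatchQuery("uri.raw", "sample_url").toMap() yields a map that already contains operator, lenient, and boost, which mirrors why the generated 7.x JSON carries parameters the caller never set.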
Hope this helps.
Regarding java - How to disable the automatic generation of default fields in ElasticSearch queries when using TermsAggregationBuilder and QueryBuilders, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/72640809/