elasticsearch - Filtering aggregation keys in Elasticsearch with a non-nested mapping

Tags: elasticsearch elasticsearch-aggregation elasticsearch-dsl

I have the following mapping:

{
  "Country": {
    "properties": {
      "State": {
        "properties": {
          "Name": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "Code": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "Lang": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
Here is a sample document:
{
  "Country": {
    "State": [
      {
        "Name": "California",
        "Code": "CA",
        "Lang": "EN"
      },
      {
        "Name": "Alaska",
        "Code": "AK",
        "Lang": "EN"
      },
      {
        "Name": "Texas",
        "Code": "TX",
        "Lang": "EN"
      }
    ]
  }
}
I am querying this index to get an aggregation of state counts by name, using the following query:
{
  "from": 0,
  "size": 0,
  "query": {
    "query_string": {
      "query": "Country.State.Name: *Ala*"
    }
  },
  "aggs": {
    "counts": {
      "terms": {
        "field": "Country.State.Name.raw",
        "include": ".*Ala.*"
      }
    }
  }
}
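With the include regex in the terms aggregation I can restrict the buckets to the keys that match the query_string, but there seems to be no way to make include case-insensitive.
The result I want is: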
{
  "aggregations": {
    "counts": {
      "buckets": [
        {
          "key": "Alaska",
          "doc_count": 1
        }
      ]
    }
  }
}
Is there another solution that returns only the keys matching the query_string, without using a nested mapping?

Best answer

Use a normalizer on the keyword data type. A sample mapping is below; note the my_normalizer definition under settings and the normalizer reference on each raw field.
Mapping:

PUT country
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "Country": {
        "properties": {
          "State": {
            "properties": {
              "Name": {
                "type": "text",
                "fields": {
                  "raw": {
                    "type": "keyword",
                    "normalizer": "my_normalizer"
                  }
                }
              },
              "Code": {
                "type": "text",
                "fields": {
                  "raw": {
                    "type": "keyword",
                    "normalizer": "my_normalizer"
                  }
                }
              },
              "Lang": {
                "type": "text",
                "fields": {
                  "raw": {
                    "type": "keyword",
                    "normalizer": "my_normalizer"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Document:
POST country/_doc/1
{
  "Country": {
    "State": [
      {
        "Name": "California",
        "Code": "CA",
        "Lang": "EN"
      },
      {
        "Name": "Alaska",
        "Code": "AK",
        "Lang": "EN"
      },
      {
        "Name": "Texas",
        "Code": "TX",
        "Lang": "EN"
      }
    ]
  }
}
Aggregation query:
POST country/_search
{
  "from": 0,
  "size": 0,
  "query": {
    "query_string": {
      "query": "Country.State.Name: *Ala*"
    }
  },
  "aggs": {
    "counts": {
      "terms": {
        "field": "Country.State.Name.raw",
        "include": "ala.*"
      }
    }
  }
}
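Note the pattern in include. Because the normalizer is applied, every value of the *.raw fields is indexed in lowercase, so the include pattern has to be written in lowercase and the bucket keys come back in lowercase as well.
Response to the aggregation query: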
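To make the effect of the normalizer on include concrete, here is a minimal Python sketch. It is only a local simulation, not Elasticsearch code: normalize and include_matches are hypothetical helper names, and Python's re module stands in for the Lucene regular expression engine, which agrees with it for these simple patterns. The one behaviour it relies on is that an include regex has to match the whole term.

```python
import re


def normalize(value: str) -> str:
    """Mimic a custom normalizer whose only filter is `lowercase`."""
    return value.lower()


def include_matches(pattern: str, term: str) -> bool:
    """An include regex must match the entire term, hence fullmatch."""
    return re.fullmatch(pattern, term) is not None


# Values of Country.State.Name from the sample document.
state_names = ["California", "Alaska", "Texas"]

# What the normalized *.raw sub-field would hold for aggregation.
indexed_terms = [normalize(name) for name in state_names]

# The lowercase pattern from the answer keeps only "alaska".
kept_lowercase = [t for t in indexed_terms if include_matches("ala.*", t)]

# The mixed-case pattern from the question no longer matches anything.
kept_mixed_case = [t for t in indexed_terms if include_matches(".*Ala.*", t)]

print(kept_lowercase)
print(kept_mixed_case)
```

In this simulation the mixed-case pattern from the question matches nothing once the terms are lowercased, which is why the answer switches the pattern to lowercase.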
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "counts" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "alaska",
          "doc_count" : 1
        }
      ]
    }
  }
}
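Note that the bucket key above is "alaska", while the question asked for "Alaska". If the original casing must be preserved, or reindexing with a normalizer is not an option, one possible workaround (not part of the answer above) is to leave the raw field as it was and spell out both cases in the include pattern using character classes. The sketch below builds such a pattern; case_insensitive_include is a hypothetical helper, it assumes ASCII letters, and Python's re is used only to check the pattern locally.

```python
import re


def case_insensitive_include(fragment: str) -> str:
    """Build an include pattern matching `fragment` in any letter case.

    ASCII letters become two-character classes such as [Aa]; other
    alphanumerics are kept as-is; everything else is backslash-escaped
    so that it is treated literally.
    """
    parts = []
    for ch in fragment:
        if ch.isascii() and ch.isalpha():
            parts.append(f"[{ch.upper()}{ch.lower()}]")
        elif ch.isalnum():
            parts.append(ch)
        else:
            parts.append("\\" + ch)
    # Allow any prefix and suffix, like the .*Ala.* pattern in the question.
    return ".*" + "".join(parts) + ".*"


pattern = case_insensitive_include("Ala")
print(pattern)

# Local check against un-normalized terms (original casing preserved).
for name in ["California", "Alaska", "ALASKA", "Texas"]:
    print(name, re.fullmatch(pattern, name) is not None)
```

The resulting string would be passed as the "include" value of the terms aggregation against the original, un-normalized Country.State.Name.raw field. The pattern grows with the length of the search fragment, so this is better suited to short fragments.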
Hope this helps!

For "elasticsearch - Filtering aggregation keys in Elasticsearch with a non-nested mapping", the corresponding question can be found on Stack Overflow: https://stackoverflow.com/questions/62625501/
