elasticsearch - Sum and count aggregations over Elasticsearch fields

Tags: elasticsearch elasticsearch-plugin elasticsearch-5

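I am new to Elasticsearch and would like to run some aggregations over the fields of an Elasticsearch 5.x index. The index contains documents with the fields langs (which has a nested structure) and docLang. Both are dynamically mapped fields. Here are some sample documents: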

DOC 1:

{
   "_index":"A",
   "_type":"document",
   "_id":"1",
   "_source":{
      "text":"This is a test sentence.",
      "langs":{
         "X":{
            "en":1,
            "es":2,
            "zh":3
         },
        "Y":{
            "en":4,
            "es":5,
            "zh":6
         } 
      },
      "docLang": "en"
   }
}

DOC 2:
{
   "_index":"A",
   "_type":"document",
   "_id":"2",
   "_source":{
      "text":"This is a test sentence.",
      "langs":{
         "X":{
            "en":1,
            "es":2
         },
         "Y":{
            "en":3,
            "es":4
         } 
      },
      "docLang": "es"
   }
}

DOC 3:
{
   "_index":"A",
   "_type":"document",
   "_id":"3",
   "_source":{
      "text":"This is a test sentence.",
      "langs":{
         "X":{
            "en":1
         },
         "Y":{
            "en":2
         } 
      },
      "docLang": "en"
   }
}

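I want to run a sum aggregation on the langs field so that, for each key (X / Y) and each language, I get the total across all documents in the index. I also want to produce a document count per language from the docLang field.

For example, for the 3 documents above, the sum aggregation on the langs field would look like this: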
"langs":{  
      "X":{  
         "en":3,
         "es":4,
         "zh":3
      },
      "Y":{  
         "en":9,
         "es":9,
         "zh":6
      }
   }
The docLang count would look like this:
 "docLang":{
    "en" : 2,
    "es" : 1
   }
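
As a sanity check on the expected numbers above, the same totals can be reproduced outside Elasticsearch. This is a minimal Python sketch that only mirrors the `_source` of the three sample documents; the names `docs`, `sum_langs` and `count_doc_langs` are illustrative and not part of any Elasticsearch API.

```python
from collections import Counter, defaultdict

# The _source of the three sample documents from the question.
docs = [
    {"langs": {"X": {"en": 1, "es": 2, "zh": 3},
               "Y": {"en": 4, "es": 5, "zh": 6}}, "docLang": "en"},
    {"langs": {"X": {"en": 1, "es": 2},
               "Y": {"en": 3, "es": 4}}, "docLang": "es"},
    {"langs": {"X": {"en": 1},
               "Y": {"en": 2}}, "docLang": "en"},
]


def sum_langs(documents):
    """Sum every langs.<key>.<language> value across all documents."""
    totals = defaultdict(Counter)
    for doc in documents:
        for key, per_lang in doc["langs"].items():
            # Counter.update adds the values instead of replacing them.
            totals[key].update(per_lang)
    return {key: dict(counter) for key, counter in totals.items()}


def count_doc_langs(documents):
    """Count how many documents carry each docLang value."""
    return dict(Counter(doc["docLang"] for doc in documents))


print(sum_langs(docs))
print(count_doc_langs(docs))
```

The printed totals should match the two expected results listed above, which is what the aggregation is supposed to return.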

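Also, because of some restrictions in the production environment, I cannot use scripts in Elasticsearch. So I would like to know whether this can be done using only field-based aggregations on the fields above.

Best answer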

{
  "size": 0,
  "aggs": {
    "X": {
      "nested": {
        "path": "langs.X"
      },
      "aggs": {
        "X_sum_en": {
          "sum": {
            "field": "langs.X.en"
          }
        },
        "X_sum_es": {
          "sum": {
            "field": "langs.X.es"
          }
        },
        "X_sum_zh": {
          "sum": {
            "field": "langs.X.zh"
          }
        }
      }
    },
    "Y": {
      "nested": {
        "path": "langs.Y"
      },
      "aggs": {
        "Y_sum_en": {
          "sum": {
            "field": "langs.Y.en"
          }
        },
        "Y_sum_es": {
          "sum": {
            "field": "langs.Y.es"
          }
        },
        "Y_sum_zh": {
          "sum": {
            "field": "langs.Y.zh"
          }
        }
      }
    },
    "sum_docLang": {
      "terms": {
        "field": "docLang.keyword",
        "size": 10
      }
    }
  }
}
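
Because scripts are ruled out, every key and language has to be listed explicitly in the request. If the keys and languages are known in advance, the request body can be generated instead of written by hand. The following is a sketch under that assumption; `build_nested_sum_query` is a hypothetical helper, and it only builds a plain dictionary with the same structure as the query above, without talking to a cluster.

```python
import json


def build_nested_sum_query(keys, languages, doc_lang_field="docLang.keyword"):
    """Build a request body with one nested aggregation per key and
    one sum sub-aggregation per language (hypothetical helper)."""
    aggs = {}
    for key in keys:
        aggs[key] = {
            "nested": {"path": f"langs.{key}"},
            "aggs": {
                f"{key}_sum_{lang}": {"sum": {"field": f"langs.{key}.{lang}"}}
                for lang in languages
            },
        }
    # Document count per language, as in the hand-written query.
    aggs["sum_docLang"] = {"terms": {"field": doc_lang_field, "size": 10}}
    return {"size": 0, "aggs": aggs}


query = build_nested_sum_query(["X", "Y"], ["en", "es", "zh"])
print(json.dumps(query, indent=2))
```

The resulting JSON can be sent as the body of a `_search` request on the index.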

You did not mention it, but I think it is important: I mapped X and Y as nested fields:
    "langs": {
      "properties": {
        "X": {
          "type": "nested",
          "properties": {
            "en": {
              "type": "long"
            },
            "es": {
              "type": "long"
            },
            "zh": {
              "type": "long"
            }
          }
        },
        "Y": {
          "type": "nested",
          "properties": {
            "en": {
              "type": "long"
            },
            "es": {
              "type": "long"
            },
            "zh": {
              "type": "long"
            }
          }
        }
      }
    }

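However, if your fields are not nested at all (by which I mean the actual nested field type in Elasticsearch), a simple aggregation like this is sufficient. Dynamically mapped JSON objects become plain object fields rather than nested ones, so this is likely the case for the index you describe: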
{
  "size": 0,
  "aggs": {
    "X_sum_en": {
      "sum": {
        "field": "langs.X.en"
      }
    },
    "X_sum_es": {
      "sum": {
        "field": "langs.X.es"
      }
    },
    "X_sum_zh": {
      "sum": {
        "field": "langs.X.zh"
      }
    },
    "Y_sum_en": {
      "sum": {
        "field": "langs.Y.en"
      }
    },
    "Y_sum_es": {
      "sum": {
        "field": "langs.Y.es"
      }
    },
    "Y_sum_zh": {
      "sum": {
        "field": "langs.Y.zh"
      }
    },
    "sum_docLang": {
      "terms": {
        "field": "docLang.keyword",
        "size": 10
      }
    }
  }
}
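
The response to the flat query returns each sum under its own aggregation name, so a little client-side reshaping is needed to get the structure requested in the question. The following is a sketch only: `reshape` is a hypothetical helper, and `sample_response` is hand-written to follow the usual response layout (`value` for a sum, `buckets` with `key` and `doc_count` for a terms aggregation), not output captured from a real cluster. Converting the sums with `int` assumes the fields hold whole numbers, as in the sample documents.

```python
def reshape(response, keys=("X", "Y")):
    """Turn the flat aggregation response into the langs/docLang shape."""
    aggs = response["aggregations"]
    langs = {key: {} for key in keys}
    for name, body in aggs.items():
        for key in keys:
            prefix = f"{key}_sum_"
            if name.startswith(prefix):
                # Sum aggregations report floats; the source values are integers.
                langs[key][name[len(prefix):]] = int(body["value"])
    doc_lang = {
        bucket["key"]: bucket["doc_count"]
        for bucket in aggs["sum_docLang"]["buckets"]
    }
    return {"langs": langs, "docLang": doc_lang}


# Hand-written example response for the three sample documents (assumption).
sample_response = {
    "aggregations": {
        "X_sum_en": {"value": 3.0},
        "X_sum_es": {"value": 4.0},
        "X_sum_zh": {"value": 3.0},
        "Y_sum_en": {"value": 9.0},
        "Y_sum_es": {"value": 9.0},
        "Y_sum_zh": {"value": 6.0},
        "sum_docLang": {
            "buckets": [
                {"key": "en", "doc_count": 2},
                {"key": "es", "doc_count": 1},
            ]
        },
    }
}

result = reshape(sample_response)
print(result)
```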

Regarding "elasticsearch - Sum and count aggregations over Elasticsearch fields", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48309246/
