elasticsearch - Elasticsearch intersection query

Tags: elasticsearch elasticsearch-aggregation elasticsearch-query

I want to get the words that a list of users have in common, sorted by total count.

Example:
I have an index of the words that users have used.

docs:

[
  {
    user_id: 1,
    word: 'food',
    count: 2
  },
  {
    user_id: 1,
    word: 'thor',
    count: 1
  },
  {
    user_id: 1,
    word: 'beer',
    count: 7
  },
  {
    user_id: 2,
    word: 'summer',
    count: 12
  },
  {
    user_id: 2,
    word: 'thor',
    count: 4
  },
  {
    user_id: 1,
    word: 'beer',
    count: 2
  },
  ..otheruserdetails..
]

Input: user_ids: [1, 2]
Desired output:
[
  {
    'word': 'beer',
    'total_count': 9
  },
  {
    'word': 'thor',
    'total_count': 5
  }
]

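What I have so far:
  • Fetch all documents whose user_id is in the list of user IDs (a bool should query)
  • Process the documents in the application layer.
  • Iterate over each keyword
  • Check whether the keyword exists for every user_id
  • If it does, work out the count
  • Otherwise, discard it and move on to the next keyword

  • However, this is not feasible, because the word documents keep growing and the application layer will not be able to keep up. Is there any way to move this into an ES query?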
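The application-layer steps listed above can be sketched roughly as follows. This is only an illustration of the procedure described, not code from the question: the function name `common_words` is hypothetical, and `docs` simply repeats the six sample documents shown earlier.

```python
from collections import defaultdict


def common_words(docs, user_ids):
    """Return words used by every user in user_ids, sorted by total count (descending)."""
    wanted = set(user_ids)
    totals = defaultdict(int)      # word -> summed count across the requested users
    users_seen = defaultdict(set)  # word -> requested users who used that word

    for doc in docs:
        if doc["user_id"] not in wanted:
            continue  # ignore documents that belong to other users
        totals[doc["word"]] += doc["count"]
        users_seen[doc["word"]].add(doc["user_id"])

    # Keep only the words that every requested user has used.
    common = [
        {"word": word, "total_count": total}
        for word, total in totals.items()
        if users_seen[word] == wanted
    ]
    return sorted(common, key=lambda row: row["total_count"], reverse=True)


# The six sample documents from the question.
docs = [
    {"user_id": 1, "word": "food", "count": 2},
    {"user_id": 1, "word": "thor", "count": 1},
    {"user_id": 1, "word": "beer", "count": 7},
    {"user_id": 2, "word": "summer", "count": 12},
    {"user_id": 2, "word": "thor", "count": 4},
    {"user_id": 1, "word": "beer", "count": 2},
]

result = common_words(docs, [1, 2])
```

Note that in the six sample documents as listed, 'beer' appears only under user 1, so a strict intersection over them yields only 'thor' with a total of 5. The 'beer' row in the desired output presumably relies on documents that are not shown.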

    Best answer

    You can use a Terms aggregation together with a Value Count aggregation.

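    A terms aggregation can be thought of as a "group by". The output gives a list of unique user IDs, the list of all words under each user, and finally a count for each word.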

    {
      "from": 0, 
      "size": 10, 
      "query": {
        "terms": {
          "user_id": [
            "1",
            "2"
          ]
        }
      },
      "aggs": {
        "users": {
          "terms": {
            "field": "user_id",
            "size": 10
          },
          "aggs": {
            "words": {
              "terms": {
                "field": "word.keyword",
                "size": 10
              },
              "aggs": {
                "word_count": {
                  "value_count": {
                    "field": "word.keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
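To make the "group by" reading concrete, here is a small local sketch of what the nested aggregation computes. It is an approximation in plain Python, not Elasticsearch itself: the helper name `group_by_user_then_word` is made up, the input is the question's six sample documents, and bucket ordering and the `size` limits are ignored. One thing it makes visible is that `value_count` counts field values per bucket, so it mirrors `doc_count` here and does not add up the `count` field.

```python
from collections import Counter, defaultdict


def group_by_user_then_word(docs):
    """Mimic nested terms aggregations: user_id -> word -> number of documents."""
    buckets = defaultdict(Counter)
    for doc in docs:
        # Like value_count, this counts documents and ignores doc["count"].
        buckets[doc["user_id"]][doc["word"]] += 1
    return {user: dict(words) for user, words in buckets.items()}


# The six sample documents from the question.
docs = [
    {"user_id": 1, "word": "food", "count": 2},
    {"user_id": 1, "word": "thor", "count": 1},
    {"user_id": 1, "word": "beer", "count": 7},
    {"user_id": 2, "word": "summer", "count": 12},
    {"user_id": 2, "word": "thor", "count": 4},
    {"user_id": 1, "word": "beer", "count": 2},
]

grouped = group_by_user_then_word(docs)
```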
    

    Result:
        "hits" : [
          {
            "_index" : "index89",
            "_type" : "_doc",
            "_id" : "gFRzr3ABAWOsYG7t2tpt",
            "_score" : 1.0,
            "_source" : {
              "user_id" : 1,
              "word" : "thor",
              "count" : 1
            }
          },
          {
            "_index" : "index89",
            "_type" : "_doc",
            "_id" : "flRzr3ABAWOsYG7t0dqI",
            "_score" : 1.0,
            "_source" : {
              "user_id" : 1,
              "word" : "food",
              "count" : 2
            }
          },
          {
            "_index" : "index89",
            "_type" : "_doc",
            "_id" : "f1Rzr3ABAWOsYG7t19ps",
            "_score" : 1.0,
            "_source" : {
              "user_id" : 2,
              "word" : "thor",
              "count" : 4
            }
          },
          {
            "_index" : "index89",
            "_type" : "_doc",
            "_id" : "gVRzr3ABAWOsYG7t8NrR",
            "_score" : 1.0,
            "_source" : {
              "user_id" : 1,
              "word" : "food",
              "count" : 2
            }
          },
          {
            "_index" : "index89",
            "_type" : "_doc",
            "_id" : "glRzr3ABAWOsYG7t-Npj",
            "_score" : 1.0,
            "_source" : {
              "user_id" : 1,
              "word" : "thor",
              "count" : 1
            }
          },
          {
            "_index" : "index89",
            "_type" : "_doc",
            "_id" : "g1Rzr3ABAWOsYG7t_9po",
            "_score" : 1.0,
            "_source" : {
              "user_id" : 2,
              "word" : "thor",
              "count" : 4
            }
          }
        ]
      },
      "aggregations" : {
        "users" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : 1,
              "doc_count" : 4,
              "words" : {
                "doc_count_error_upper_bound" : 0,
                "sum_other_doc_count" : 0,
                "buckets" : [
                  {
                    "key" : "food",
                    "doc_count" : 2,
                    "word_count" : {
                      "value" : 2
                    }
                  },
                  {
                    "key" : "thor",
                    "doc_count" : 2,
                    "word_count" : {
                      "value" : 2
                    }
                  }
                ]
              }
            },
            {
              "key" : 2,
              "doc_count" : 2,
              "words" : {
                "doc_count_error_upper_bound" : 0,
                "sum_other_doc_count" : 0,
                "buckets" : [
                  {
                    "key" : "thor",
                    "doc_count" : 2,
                    "word_count" : {
                      "value" : 2
                    }
                  }
                ]
              }
            }
          ]
        }
      }
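The result above is grouped per user, and `word_count` equals `doc_count` rather than the sum of the `count` field, so the per-word totals across users still have to be assembled by the caller. As a possible alternative (not the answer's method), the sketch below builds a request body that groups by word, sums `count`, counts distinct users with a `cardinality` aggregation, keeps only words used by all requested users via `bucket_selector`, and sorts with `bucket_sort`. The function name, the aggregation names, and the bucket sizes are assumptions chosen for illustration, and the body has not been run against the asker's index.

```python
import json


def build_common_words_query(user_ids, max_words=10, candidate_words=10000):
    """Build a _search body for words shared by all user_ids (hypothetical helper)."""
    expected_users = len(set(user_ids))
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {"terms": {"user_id": user_ids}},
        "aggs": {
            "words": {
                # Assumes "word" has a keyword sub-field, as in the answer's query.
                "terms": {"field": "word.keyword", "size": candidate_words},
                "aggs": {
                    # Assumes "count" is mapped as a numeric field.
                    "total_count": {"sum": {"field": "count"}},
                    # Distinct users per word (an approximate count in general).
                    "user_count": {"cardinality": {"field": "user_id"}},
                    # Drop words that were not used by every requested user.
                    "used_by_all": {
                        "bucket_selector": {
                            "buckets_path": {"users": "user_count"},
                            "script": {
                                "source": "params.users >= params.expected",
                                "params": {"expected": expected_users},
                            },
                        }
                    },
                    # Sort the surviving words by their summed count.
                    "by_total": {
                        "bucket_sort": {
                            "sort": [{"total_count": {"order": "desc"}}],
                            "size": max_words,
                        }
                    },
                },
            }
        },
    }


body = build_common_words_query([1, 2])
body_json = json.dumps(body, indent=2)  # what would be sent to the _search endpoint
```

The pipeline aggregations only see the word buckets that the `terms` aggregation returns, which is why `candidate_words` is set high here. For a very large vocabulary that limit may need a different strategy.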
    

    Regarding elasticsearch - Elasticsearch intersection query, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60562308/
