python - How to find distinct values in an array across all indexes using elasticsearch-dsl?

Tags: python elasticsearch elasticsearch-dsl

I am using elasticsearch-dsl in Django. I have defined a DocType document with a keyword field that holds a list of values.

Here is my code.

from elasticsearch_dsl import DocType, Text, Keyword

class ProductIndex(DocType):
    """
    Index for products
    """
    id = Keyword()
    slug = Keyword()
    name = Text()
    filter_list = Keyword()

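Here filter_list is an array holding multiple values. Now suppose I have a list of distinct values, say sample_filter_list, some of whose elements may appear in the filter_list array of some products. Given this sample_filter_list, I want all the unique elements of filter_list across every product whose filter_list has a non-empty intersection with sample_filter_list.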

For example, I have 5 products whose filter_list values are:
1) ['a', 'b', 'c']
2) ['d', 'e', 'f']
3) ['g', 'h', 'i']
4) ['j', 'k', 'l']
5) ['m', 'n', 'o']
and if my sample_filter_list is ['a', 'd', 'g', 'j', 'm'],
then Elasticsearch should return an array containing the distinct elements,
i.e. ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']
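
To pin down the expected result, the requirement above can be written as plain Python over in-memory lists. This is only a reference sketch of the semantics, not an Elasticsearch query, and the function name distinct_matching_values is made up for illustration.

```python
def distinct_matching_values(filter_lists, sample_filter_list):
    """Return the sorted unique values of every list that shares an element with the sample."""
    sample = set(sample_filter_list)
    result = set()
    for filter_list in filter_lists:
        # Keep a product only if its filter_list intersects the sample.
        if sample.intersection(filter_list):
            result.update(filter_list)
    return sorted(result)


products = [
    ['a', 'b', 'c'],
    ['d', 'e', 'f'],
    ['g', 'h', 'i'],
    ['j', 'k', 'l'],
    ['m', 'n', 'o'],
]
print(distinct_matching_values(products, ['a', 'd', 'g', 'j', 'm']))
```

With a sample that touches only the first product, such as ['a'], the same function would return just that product's values.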

Best answer

This answer is general rather than specific to Django.
Suppose you have an ES index some_index2 with the following mapping:

            PUT some_index2
            {
              "mappings": {
                "some_type": {
                  "dynamic_templates": [
                    {
                      "strings": {
                        "mapping": {
                          "type": "string"
                        },
                        "match_mapping_type": "string"
                      }
                    }
                  ],
                  "properties": {
                    "field1": {
                      "type": "string"
                    },
                    "field2": {
                      "type": "string"
                    }
                  }
                }
              }
            }

You have also indexed the following documents:
        {
            "field1":"id1",
            "field2":["a","b","c","d"]
        }
        {
            "field1":"id2",
            "field2":["e","f","g"]
        }
        {
            "field1":"id3",
            "field2":["e","l","k"]
        }

As you stated, you want all the distinct values of field2 (filter_list in your case). You can get them with an Elasticsearch terms aggregation:

    GET some_index2/_search
    {
    "aggs": {
      "some_name": {
        "terms": {
          "field": "field2",
          "size": 10000
        }
      }
    },
    "size": 0
    }
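
The request above aggregates over every document in the index. To restrict it to documents whose field2 shares at least one value with a sample list, as the question asks, a terms query could be added alongside the aggregation. The following is a minimal Python sketch that only builds the request body. The helper name build_distinct_values_body and its parameter names are assumptions for illustration, and the resulting dict still has to be sent with whichever client you use.

```python
import json


def build_distinct_values_body(field, sample_values, agg_name="some_name", max_buckets=10000):
    """Build a search body: filter on the sample values, then aggregate distinct terms."""
    return {
        # Match documents whose array field contains at least one sample value.
        "query": {"terms": {field: list(sample_values)}},
        "aggs": {
            agg_name: {
                # "size" caps the number of buckets (distinct values) returned.
                "terms": {"field": field, "size": max_buckets}
            }
        },
        # No hits are needed, only the aggregation buckets.
        "size": 0,
    }


body = build_distinct_values_body("field2", ["a", "e"])
print(json.dumps(body, indent=2))
```

With elasticsearch-dsl, a body like this should be loadable through Search.from_dict(body). Because "size" limits how many buckets come back, it needs to be at least as large as the number of distinct values you expect.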

This returns a result like:

    {
      "took": 2,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "some_name": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "e",
              "doc_count": 2
            },
            {
              "key": "a",
              "doc_count": 1
            },
            {
              "key": "b",
              "doc_count": 1
            },
            {
              "key": "c",
              "doc_count": 1
            },
            {
              "key": "d",
              "doc_count": 1
            },
            {
              "key": "f",
              "doc_count": 1
            },
            {
              "key": "g",
              "doc_count": 1
            },
            {
              "key": "k",
              "doc_count": 1
            },
            {
              "key": "l",
              "doc_count": 1
            }
          ]
        }
      }
    }

Here buckets contains all the distinct values.
You can iterate over the buckets and read the value under "key" in each one.
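
As a sketch of that iteration in Python, assuming the response has been parsed into a dict shaped like the JSON above (the helper name distinct_keys is hypothetical, and the sample response is abbreviated to three of the buckets shown):

```python
def distinct_keys(response, agg_name="some_name"):
    """Collect the "key" of every bucket in a terms aggregation response."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return [bucket["key"] for bucket in buckets]


sample_response = {
    "aggregations": {
        "some_name": {
            "buckets": [
                {"key": "e", "doc_count": 2},
                {"key": "a", "doc_count": 1},
                {"key": "b", "doc_count": 1},
            ]
        }
    }
}
print(distinct_keys(sample_response))  # ['e', 'a', 'b']
```

The keys come back in the order of the buckets, which by default is by descending doc_count, so sort them if you need a stable alphabetical list.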

Hope this is what you need.

Regarding python - How to find distinct values in an array across all indexes using elasticsearch-dsl?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51056111/
