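elasticsearch - Querying and sorting on a date field that may not exist

Requirement: I want to query and sort on a date field that may not exist. Every record without the date field should be included, along with every record whose date value is less than 1600230168278. The results should list the records without the date field first, followed by the rest in ascending date order.

Mapping and sample data: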

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "date": {
          "type": "date"
        },
        "name": {
          "type": "text"
        }
      }
    }
  }
}

PUT my_index/_doc/1
{
  "date": 1546300800000
} 

PUT my_index/_doc/2
{
  "date": 1577836800000
} 

PUT my_index/_doc/3
{
  "date": 1609459200000
} 

PUT my_index/_doc/4
{
  "name": "Arif Mahmud Rana"
}
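
To make the requirement concrete, here is a minimal sketch in plain Python that models the intended selection and ordering on the four sample documents. It is only a reference model, not Elasticsearch code; the helper name `select_and_order` and the `docs` dictionary are made up for illustration.

```python
CUTOFF_MILLIS = 1600230168278  # cutoff value taken from the question

# The sample documents from the question, keyed by _id.
docs = {
    "1": {"date": 1546300800000},
    "2": {"date": 1577836800000},
    "3": {"date": 1609459200000},
    "4": {"name": "Arif Mahmud Rana"},
}


def select_and_order(docs, cutoff):
    """Hypothetical helper: ids of documents without a date come first,
    then ids of documents with date < cutoff, in ascending date order."""
    missing = [doc_id for doc_id, doc in docs.items() if "date" not in doc]
    dated = [
        doc_id
        for doc_id, doc in docs.items()
        if "date" in doc and doc["date"] < cutoff
    ]
    dated.sort(key=lambda doc_id: docs[doc_id]["date"])
    return missing + dated


ordered_ids = select_and_order(docs, CUTOFF_MILLIS)
print(ordered_ids)  # ['4', '1', '2']
```

Document 3 is dropped because its date is later than the cutoff, so any correct query should return ids 4, 1, 2 in that order.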

My query:

{
  "query": {
    "bool": {
      "must": {
        "function_score": {
          "functions": [
            {
              "filter": {
                "exists": {
                  "field": "date"
                }
              },
              "weight": 0.5
            }
          ],
          "query": {
            "match_all": {}
          }
        }
      },
      "filter": {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "bool": {
                "must": [
                  {
                    "exists": {
                      "field": "date"
                    }
                  },
                  {
                    "range": {
                      "date": {
                        "lt": 1600230168278
                      }
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "must_not": {
                  "exists": {
                    "field": "date"
                  }
                }
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "_score": "desc"
    },
    {
      "date": "asc"
    }
  ],
  "size": 100
}

Query result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "Arif Mahmud Rana"
        },
        "sort" : [
          1.0,
          9223372036854775807
        ]
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5,
        "_source" : {
          "date" : 1546300800000
        },
        "sort" : [
          0.5,
          1546300800000
        ]
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.5,
        "_source" : {
          "date" : 1577836800000
        },
        "sort" : [
          0.5,
          1577836800000
        ]
      }
    ]
  }
}

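This works fine on a simple index with little data, but when I run it against a larger index my Elastic node goes down.
Elastic version: 6.8.5
Actual index: 3048140 (docs.count), 1073559 (docs.deleted), 1.3gb (store.size), and 1.3gb (pri.store.size)
Any help or ideas would be great. TIA.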

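Best answer

I believe the custom scoring for all the documents that don't have the date field in the large index is what causes the problem.
Here is a way to achieve the use case with the sort option `missing`, which defines where documents that lack the sort field are placed. Note that the example mapping below is written without a type name (7.x style); on 6.8 the properties would need to sit under a `_doc` level, as in the question's mapping.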

GET test/_search
{"query":{"match_all":{}}}

PUT /test
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "age": {
        "type": "integer"
      }
    }
  }
}

POST test/_doc
{
  "name": "shahin",
  "age": 234
}


POST test/_doc
{
  "name": "karim",
  "age": 235
}


POST test/_doc
{
  "name": "rahim"
}

POST test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": {
              "range": {
                "age": {
                  "lt": 250
                }
              }
            }
          }
        },
        {
          "bool": {
            "must_not": {
              "exists": {
                "field": "age"
              }
            }
          }
        }
      ]
    }
  },
  "sort": [
    { "age": { "missing": "_first", "order": "asc" } }
  ],
  "size": 100
}
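
The same idea can be carried over to the `date` field from the question. The sketch below builds the request body as a Python dictionary and prints it as JSON, so it can be pasted after `POST my_index/_search`. It is an adaptation rather than part of the original answer: the function name `build_search_body` is hypothetical, and placing both clauses under `filter` is an assumption made here so that no relevance scores are needed and ordering comes only from the sort.

```python
import json


def build_search_body(field, cutoff_millis, size=100):
    """Hypothetical helper: match documents where `field` is missing or is
    less than `cutoff_millis`, sorted with missing values first."""
    return {
        "query": {
            "bool": {
                # Filter context: clauses only include or exclude documents.
                "filter": {
                    "bool": {
                        "minimum_should_match": 1,
                        "should": [
                            # A range clause only matches documents that have
                            # the field, so no separate exists check is needed.
                            {"range": {field: {"lt": cutoff_millis}}},
                            {"bool": {"must_not": {"exists": {"field": field}}}},
                        ],
                    }
                }
            }
        },
        # Documents without the field sort first, the rest ascending.
        "sort": [{field: {"order": "asc", "missing": "_first"}}],
        "size": size,
    }


body = build_search_body("date", 1600230168278)
print(json.dumps(body, indent=2))
```

Compared with the query in the question, this drops the `function_score` wrapper and the `_score` sort entirely. Whether that is enough to keep the node stable on the large index would still have to be checked against the real data.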

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/63913506/