search - Combining Elasticsearch query_string with match_phrase

Tags: search elasticsearch lucene full-text-search

I think it is best to describe my intent and then try to break it down into code.

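  • If the user wants the features that query_string provides, I want them to be able to run complex queries, for example with "AND", "OR" and "~".
  • I want fuzziness to be in effect, which has me doing things like sending "#{query}~" to ES. In other words, I specify the fuzzy query on the user's behalf, because we offer transliteration and the exact spelling can be hard to get right.
  • Sometimes users search for several words that form a phrase. query_string searches for the words individually rather than as a phrase. For example, a three-word search such as "who is willing" should first return the documents that contain those three words in that order as the best matches, and only then anything else.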

  • Current query:
    {
      "indices_boost": {},
      "aggregations": {
        "by_ayah_key": {
          "terms": {
            "field": "ayah.ayah_key",
            "size": 6236,
            "order": {
              "average_score": "desc"
            }
          },
          "aggregations": {
            "match": {
              "top_hits": {
                "highlight": {
                  "fields": {
                    "text": {
                      "type": "fvh",
                      "matched_fields": [
                        "text.root",
                        "text.stem_clean",
                        "text.lemma_clean",
                        "text.stemmed",
                        "text"
                      ],
                      "number_of_fragments": 0
                    }
                  },
                  "tags_schema": "styled"
                },
                "sort": [
                  {
                    "_score": {
                      "order": "desc"
                    }
                  }
                ],
                "_source": {
                  "include": [
                    "text",
                    "resource.*",
                    "language.*"
                  ]
                },
                "size": 5
              }
            },
            "average_score": {
              "avg": {
                "script": "_score"
              }
            }
          }
        }
      },
      "from": 0,
      "size": 0,
      "_source": [
        "text",
        "resource.*",
        "language.*"
      ],
      "query": {
        "bool": {
          "must": [
            {
              "query_string": {
                "query": "inna alatheena",
                "fuzziness": 1,
                "fields": [
                  "text^1.6",
                  "text.stemmed"
                ],
                "minimum_should_match": "85%"
              }
            }
          ],
          "should": [
            {
              "match": {
                "text": {
                  "query": "inna alatheena",
                  "type": "phrase"
                }
              }
            }
          ]
        }
      }
    }
    

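    Note: although alatheena is in my index, searching for allatheena without ~ returns nothing. That is why I have to use fuzzy search.

    Any ideas?

    Best answer

    You should use the Dis Max Query to achieve this.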

    A query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries.

    This is useful when searching for a word in multiple fields with different boost factors (so that the fields cannot be combined equivalently into a single search field). We want the primary score to be the one associated with the highest boost.



    A quick example of how to use it:
    POST /_search
    {
      "query": {
        "dis_max": {
          "tie_breaker": 0.7,
          "boost": 1.2,
          "queries": [
            {
              "match": {
                "text": {
                  "query": "inna alatheena",
                  "type": "phrase",
                  "boost": 5
                }
              }
            },
            {
              "match": {
                "text": {
                  "query": "inna alatheena",
                  "type": "phrase",
                  "fuzziness": "AUTO",
                  "boost": 3
                }
              }
            },
            {
              "query_string": {
                "default_field": "text",
                "query": "inna alatheena"
              }
            }
          ]
        }
      }
    }
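
On more recent Elasticsearch releases the same idea may need slightly different syntax: as far as I know, the `type` parameter of `match` was dropped in favour of a separate `match_phrase` query, and `match_phrase` does not accept `fuzziness`. The sketch below builds a comparable request body in Python under those assumptions. The helper name `build_dis_max_query` and the choice of a plain fuzzy `match` for the second clause are illustrative and not part of the original answer.

```python
import json


def build_dis_max_query(text, field="text", tie_breaker=0.7):
    """Build a dis_max request body: exact phrase, fuzzy terms, query_string."""
    return {
        "query": {
            "dis_max": {
                "tie_breaker": tie_breaker,
                "boost": 1.2,
                "queries": [
                    # Exact phrase, words in order: strongest signal, highest boost.
                    {"match_phrase": {field: {"query": text, "boost": 5}}},
                    # Fuzzy per-term match (assumption: stands in for the fuzzy phrase clause).
                    {"match": {field: {"query": text, "fuzziness": "AUTO", "boost": 3}}},
                    # query_string keeps the AND / OR / ~ syntax available to the user.
                    {"query_string": {"default_field": field, "query": text}},
                ],
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to POST /_search.
    print(json.dumps(build_dis_max_query("inna alatheena"), indent=2))
```

The printed JSON can be pasted into a `POST /_search` request. The boosts are the ones from the example above and would need tuning against real data.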
    

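    It runs all of your queries and, for each document, keeps the score of the highest-scoring one, plus a tie_breaker share of the scores from the other matching queries. So just define your rules with it, and you should get what you want.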
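
To make the scoring rule concrete, here is a small sketch of how dis_max combines the subquery scores for a single document, following the description quoted earlier: the best score plus tie_breaker times the sum of the other matching scores. The input scores are made-up numbers, and `dis_max_score` is a hypothetical helper, not an Elasticsearch API.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Combine the scores of the subqueries that matched one document."""
    if not sub_scores:
        return 0.0  # no subquery matched this document
    best = max(sub_scores)
    others = sum(sub_scores) - best  # scores of the remaining matching subqueries
    return best + tie_breaker * others


if __name__ == "__main__":
    # Made-up scores for the phrase, fuzzy and query_string clauses.
    print(dis_max_score([5.0, 3.0, 1.0], tie_breaker=0.7))
```

With `tie_breaker` at 0 only the best clause counts. Raising it toward 1 rewards documents that match several clauses at once, which is what lets an exact phrase hit that also matches the fuzzy and query_string clauses rise above a fuzzy-only hit.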

    Source: a similar question on Stack Overflow, "Combining Elasticsearch query_string with match_phrase": https://stackoverflow.com/questions/33580698/
