python - Elasticsearch suggester full-text search

Tags: python django elasticsearch elasticsearch-dsl

I am using django_elasticsearch_dsl.
My document:

html_strip = analyzer(
    'html_strip',
    tokenizer='standard',
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)

class Document(django_elasticsearch_dsl.Document):
    name = TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.KeywordField(),
            'suggest': fields.CompletionField(),
        }
    )
    ...
My request:
_search = Document.search().suggest("suggestions", text=query, completion={'field': 'name.suggest'}).execute()
I have indexed the following document "names":
"This is a test"
"this is my test"
"this test"
"Test this"
Now, if I search for This is my test, I only receive
"this is my test"
However, if I search for test, all I get is
"Test this"
even though I want every document that has test in its name.
What am I missing?

Best answer

Based on the comment given by the user, adding another answer using ngrams


Adding a working example with index mapping, index data, search query, and search result.
Index mapping:
{
  "settings": {
    "analysis": {
      "filter": {
        "ngram_filter": {
          "type": "ngram",
          "min_gram": 4,
          "max_gram": 20
        }
      },
      "analyzer": {
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "ngram_filter"
          ]
        }
      }
    },
    "max_ngram_diff": 50
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ngram_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
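The `max_ngram_diff` setting is there because the gap between `max_gram` and `min_gram` in this mapping (20 - 4 = 16) is larger than the index default of 1, so the index would otherwise be rejected. A minimal sketch of that rule in plain Python, where `ngram_diff_allowed` is a hypothetical helper written for illustration and not part of any Elasticsearch client:

```python
# Hypothetical helper: mirrors the rule that max_gram - min_gram must not
# exceed index.max_ngram_diff, which defaults to 1.
DEFAULT_MAX_NGRAM_DIFF = 1


def ngram_diff_allowed(min_gram, max_gram, max_ngram_diff=DEFAULT_MAX_NGRAM_DIFF):
    """Return True if the n-gram range fits within the allowed difference."""
    return (max_gram - min_gram) <= max_ngram_diff


# Values taken from the mapping above.
without_setting = ngram_diff_allowed(4, 20)
with_setting = ngram_diff_allowed(4, 20, max_ngram_diff=50)
print(without_setting, with_setting)
```

A value of 16 would already be enough for this mapping; 50 simply leaves headroom.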
Index data:
{
  "name": [
    "Test this"
  ]
}

{
  "name": [
    "This is a test"
  ]
}

{
  "name": [
    "this is my test"
  ]
}

{
  "name": [
    "this test"
  ]
}
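If you prefer to load these four documents in one request, the bulk API takes newline-delimited JSON: one action line followed by one source line per document. The sketch below only builds that body with the standard library. The index name and the IDs are copied from the search result further down, and `build_bulk_body` is a hypothetical helper name.

```python
import json

INDEX_NAME = "stof_64281341"  # index name as it appears in the search result

# IDs follow the "_id" values shown in the search result.
docs = {
    "1": "this test",
    "2": "This is a test",
    "3": "this is my test",
    "4": "Test this",
}


def build_bulk_body(index_name, docs):
    """Return NDJSON: an action line, then a source line, for each document."""
    lines = []
    for doc_id, name in docs.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps({"name": [name]}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(INDEX_NAME, docs)
print(bulk_body)
```

The resulting string can be sent as the body of a `POST /_bulk` request with whichever HTTP or Elasticsearch client you already use.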
Analyze API (run against the index, since ngram_analyzer is defined in its settings):
POST /stof_64281341/_analyze

{
  "analyzer" : "ngram_analyzer",
  "text" : "this is my test"
}
This generates the following tokens:
{
  "tokens": [
    {
      "token": "this",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "test",
      "start_offset": 11,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
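Only "this" and "test" come out because "is" and "my" are shorter than `min_gram` (4), so the n-gram filter emits nothing for them. A rough pure-Python approximation of the analyzer shows the same behavior. It assumes that splitting on `\w+` is close enough to the standard tokenizer for this input, and `ngram_analyze` is a hypothetical name, not an Elasticsearch API.

```python
import re


def ngram_analyze(text, min_gram=4, max_gram=20):
    """Approximate ngram_analyzer: split into words, lowercase, emit n-grams."""
    tokens = []
    # \w+ is only a rough stand-in for the standard tokenizer.
    for word in re.findall(r"\w+", text):
        word = word.lower()
        for start in range(len(word)):
            for size in range(min_gram, max_gram + 1):
                if start + size > len(word):
                    break  # no longer n-gram fits from this start position
                tokens.append(word[start:start + size])
    return tokens


tokens = ngram_analyze("this is my test")
print(tokens)
```

Longer words produce many more tokens, for example every substring of "testing" with at least four characters, which is what lets a search for `test` match inside longer words as well.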
Search query:
{
    "query": {
        "match": {
           "name": "test"
        }
    }
}
Search result:
"hits": [
      {
        "_index": "stof_64281341",
        "_type": "_doc",
        "_id": "4",
        "_score": 0.2876821,
        "_source": {
          "name": [
            "Test this"
          ]
        }
      },
      {
        "_index": "stof_64281341",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.2876821,
        "_source": {
          "name": [
            "this is my test"
          ]
        }
      },
      {
        "_index": "stof_64281341",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "name": [
            "This is a test"
          ]
        }
      },
      {
        "_index": "stof_64281341",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "name": [
            "this test"
          ]
        }
      }
    ]
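The completion suggester from the question matches from the beginning of the indexed input, which is most likely why `test` only returned "Test this". The `match` query on n-gram tokens finds the word wherever it occurs. The sketch below contrasts the two in plain Python. Both functions are simplified stand-ins written for illustration, not the actual Elasticsearch implementations.

```python
import re

NAMES = ["This is a test", "this is my test", "this test", "Test this"]


def prefix_hits(query, names):
    """Rough stand-in for a completion suggester: prefix match on the name."""
    return [n for n in names if n.lower().startswith(query.lower())]


def ngrams(word, min_gram=4, max_gram=20):
    """All substrings of word whose length is between min_gram and max_gram."""
    return {word[i:i + n] for i in range(len(word))
            for n in range(min_gram, max_gram + 1) if i + n <= len(word)}


def match_hits(query, names):
    """Rough stand-in for a match query: lowercased query terms are compared
    against the n-grams indexed for each name."""
    terms = set(re.findall(r"\w+", query.lower()))
    hits = []
    for name in names:
        indexed = set()
        for word in re.findall(r"\w+", name.lower()):
            indexed |= ngrams(word)
        if terms & indexed:
            hits.append(name)
    return hits


prefix_result = prefix_hits("test", NAMES)
match_result = match_hits("test", NAMES)
print(prefix_result)
print(match_result)
```

Note that the query side is not n-grammed here, which corresponds to `"search_analyzer": "standard"` in the mapping.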
For fuzzy search, you can use the following search query:
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "tst"    <-- used tst in place of test
      }
    }
  }
}
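The fuzzy query matches indexed terms within a small edit distance of the given value. With the default `AUTO` fuzziness, a value of 3 to 5 characters is allowed one edit, and "tst" is one insertion away from "test". A minimal sketch of that reasoning using plain Levenshtein distance follows. Elasticsearch also counts a transposition of two adjacent characters as a single edit by default, which this sketch does not model, and both function names are hypothetical.

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def auto_fuzziness(term):
    """Allowed edits under AUTO: 0 for 1-2 chars, 1 for 3-5, 2 for longer."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


distance = levenshtein("tst", "test")
allowed = auto_fuzziness("tst")
print(distance, allowed, distance <= allowed)
```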

Regarding python - Elasticsearch suggester full-text search, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64281341/
