elasticsearch - How to match split words like "i phone" in Elasticsearch

I created two indices, fashion and mobiles, each with a field called "name".

client.indices.create(index='fashion',body={"mappings": {"doc": {"properties": {"name": {"type": "string"} } } } })
client.indices.create(index='mobiles',body={"mappings": {"doc": {"properties": {"name": {"type": "string"} } } } })

For fashion, I added the following documents:

client.index(index='fashion',doc_type='blog',body={"query":{ "name": "i shirts" }})
client.index(index='fashion',doc_type='blog',body={"query":{ "name": "i celekon" }})
client.index(index='fashion',doc_type='blog',body={"query":{ "name": "satsung" }})

For mobiles:

client.index(index='mobiles',doc_type='blog',body={"query":{ "name": "apple iphone 6s" }})
client.index(index='mobiles',doc_type='blog',body={"query":{ "name": "samsung galaxy s2" }})
client.index(index='mobiles',doc_type='blog',body={"query":{ "name": "apple iphone 5s" }})

When I search with a match query like this:

search = "i phone"
test = client.search(
    index='mobiles,fashion',
    doc_type='blog',
    size=10,
    body={
        "query": {
            "bool": {
                "should": [
                    {"match": {"name": {"query": search, "slop": 10, "max_expansions": 2}}},
                    {"match_phrase_prefix": {"name": {"query": search, "slop": 10, "max_expansions": 2}}},
                    {"match": {"name": {"query": search, "fuzziness": 1}}},
                ]
            }
        }
    },
)

I get the results in the following order:

i shirts , i celekon , apple iphone 6s , apple iphone 5s

How can I get the following results instead?

apple iphone 6s , apple iphone 5s, ....

How do sites like "amazon" and "flipkart" implement this kind of search?

Note: I am using the elasticsearch-py API for searching.

Best answer

You have to create a custom analyzer that uses the Word Delimiter Token Filter:

Named word_delimiter, it splits words into subwords and performs optional transformations on subword groups. Words are split into subwords with the following rules:

split on intra-word delimiters (by default, all non alpha-numeric
characters). "Wi-Fi" → "Wi", "Fi"

split on case transitions: "PowerShot" → "Power", "Shot"

split on letter-number transitions: "SD500" → "SD", "500"

leading and trailing intra-word delimiters on each subword are ignored: "//hello---there, dude" → "hello", "there", "dude"

trailing "'s" are removed for each subword: "O’Neil’s" → "O", "Neil"

I think the second example is the one you are looking for. If you index iPhone, the filter creates the tokens "i" and "Phone", which is exactly what you want.
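To make the splitting rules concrete, here is a rough pure-Python approximation of them. It is only a sketch for experimenting without a cluster: `split_subwords` is a hypothetical helper, not part of Elasticsearch or elasticsearch-py, and it does not reproduce every edge case of the real filter. Note that the split relies on a case transition, so an all-lowercase "iphone" stays a single token.

```python
import re

def split_subwords(token, preserve_original=False):
    """Approximate the word_delimiter rules quoted above (illustration only)."""
    # Drop a possessive 's (straight or curly apostrophe) before splitting.
    cleaned = re.sub(r"['’]s\b", "", token)
    # Each match is one subword: an uppercase run with its lowercase tail,
    # a lowercase run, or a digit run. Non-alphanumeric characters act as
    # delimiters because none of the alternatives can match them.
    subwords = re.findall(r"[A-Z]+[a-z]*|[a-z]+|[0-9]+", cleaned)
    # Mimic preserve_original: keep the unsplit token next to its parts.
    if preserve_original and subwords != [token]:
        subwords.insert(0, token)
    return subwords

for word in ["iPhone", "PowerShot", "SD500", "Wi-Fi", "iphone"]:
    print(word, "->", split_subwords(word))
```

With this approximation, "iPhone" yields "i" and "Phone", while "iphone" is left unchanged because it contains no case transition.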

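One thing to keep in mind: pay attention to the "preserve_original" parameter and set it to true so that the original word is kept as well. This matters because users can then search for either i Phone or iPhone and still get a match.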
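As a minimal sketch, the index settings for such an analyzer could look like the following. The names `split_words` and `name_analyzer` are made up for illustration, and the mapping mirrors the "doc" type and "string" field from the question, which assumes an older Elasticsearch version (newer versions use "text" and have no mapping types). The filter is placed before `lowercase` so that case transitions are still visible when it runs. Documents already stored in all lowercase, such as "apple iphone 6s", have no case transition to split on, so this alone would not make "i phone" match them.

```python
# Hypothetical names: "split_words" (token filter) and "name_analyzer" (analyzer).
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "split_words": {
                    "type": "word_delimiter",
                    # Keep "iPhone" itself in addition to "i" and "Phone".
                    "preserve_original": True,
                }
            },
            "analyzer": {
                "name_analyzer": {
                    "type": "custom",
                    # Whitespace tokenizer leaves delimiters and case intact
                    # for the word_delimiter filter to work on.
                    "tokenizer": "whitespace",
                    # Split first, lowercase afterwards.
                    "filter": ["split_words", "lowercase"],
                }
            },
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "name": {"type": "string", "analyzer": "name_analyzer"}
            }
        }
    },
}

# With a running cluster and the elasticsearch-py client from the question:
# client.indices.create(index='mobiles', body=index_body)
```

The index has to be created with these settings before any documents are added, since the analyzer is applied at index time.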

A similar question, "How to match split words like "i phone" in Elasticsearch", can be found on Stack Overflow: https://stackoverflow.com/questions/33516502/
