I created two indices, fashion and mobiles, each with a "name" field.
client.indices.create(index='fashion',body={"mappings": {"doc": {"properties": {"name": {"type": "string"} } } } })
client.indices.create(index='mobiles',body={"mappings": {"doc": {"properties": {"name": {"type": "string"} } } } })
For fashion, I added the following documents.
client.index(index='fashion',doc_type='blog',body={"query":{ "name": "i shirts" }})
client.index(index='fashion',doc_type='blog',body={"query":{ "name": "i celekon" }})
client.index(index='fashion',doc_type='blog',body={"query":{ "name": "satsung" }})
For mobiles:
client.index(index='mobiles',doc_type='blog',body={"query":{ "name": "apple iphone 6s" }})
client.index(index='mobiles',doc_type='blog',body={"query":{ "name": "samsung galaxy s2" }})
client.index(index='mobiles',doc_type='blog',body={"query":{ "name": "apple iphone 5s" }})
When I use a match query to search for something like
search="i phone"
test = client.search(index='mobiles,fashion', doc_type='blog', size=10, body={"query": {"bool": {"should": [
    {"match": {"name": {"query": search, "slop": 10, "max_expansions": 2}}},
    {"match_phrase_prefix": {"name": {"query": search, "slop": 10, "max_expansions": 2}}},
    {"match": {"name": {"query": search, "fuzziness": 1}}},
]}}})
I get the results in the following order.
i shirts , i celekon , apple iphone 6s , apple iphone 5s
How can I get the results in the following order instead?
apple iphone 6s , apple iphone 5s, ....
How do sites like "amazon" and "flipkart" implement this kind of search?
Note: I am using the elasticsearch-py API for searching.
Best answer
You have to create a custom analyzer that uses the Word Delimiter Token Filter:
Named word_delimiter, it splits words into subwords and performs optional transformations on subword groups. Words are split into subwords with the following rules:
- split on intra-word delimiters (by default, all non alpha-numeric characters): "Wi-Fi" → "Wi", "Fi"
- split on case transitions: "PowerShot" → "Power", "Shot"
- split on letter-number transitions: "SD500" → "SD", "500"
- leading and trailing intra-word delimiters on each subword are ignored: "//hello---there, dude" → "hello", "there", "dude"
- trailing "'s" are removed for each subword: "O’Neil’s" → "O", "Neil"
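The rules above can be approximated in a few lines of plain Python. This is only a rough sketch for building intuition, not the filter's actual implementation: the function name split_subwords is made up for this example, it handles ASCII letters and digits only, and it ignores the filter's optional transformations.

```python
import re

def split_subwords(text):
    """Roughly mimic the word_delimiter splitting rules (illustrative only)."""
    tokens = []
    for word in text.split():
        # Remove a trailing possessive 's (straight or curly apostrophe).
        word = re.sub(r"['’]s$", "", word)
        # Split on intra-word delimiters; leading/trailing ones yield empty
        # strings, which are dropped.
        parts = [p for p in re.split(r"[^A-Za-z0-9]+", word) if p]
        for part in parts:
            # Break each part on case and letter-number transitions:
            # a lowercase run (optionally capitalised), an uppercase run,
            # or a digit run.
            tokens.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+|[0-9]+", part))
    return tokens

print(split_subwords("PowerShot SD500"))  # ['Power', 'Shot', 'SD', '500']
print(split_subwords("iPhone"))           # ['i', 'Phone']
print(split_subwords("iphone"))           # ['iphone']
```

Note that an all-lowercase "iphone" contains no case transition, so under these rules it stays a single token.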
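I think you are looking for the second rule, splitting on case transitions. If you index iPhone, it creates the tokens "i" and "Phone", which is exactly what you are looking for. One thing to keep in mind: pay attention to the "preserve_original" parameter and set it to true, so that the original word is kept as well. This matters because users may search for either i Phone or iPhone, and both will still score.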
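As a minimal sketch of what such index settings might look like with elasticsearch-py: the filter name "name_delimiter", the analyzer name "name_analyzer", the helper create_index, and the choice of the whitespace tokenizer are assumptions made for this example, and the mapping reuses the "doc" type and "string" field type from the question, which fit older Elasticsearch versions.

```python
# Index body with a custom analyzer built on the word_delimiter token filter.
# "name_delimiter" and "name_analyzer" are illustrative names, not built-ins.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                "name_delimiter": {
                    "type": "word_delimiter",
                    # Keep "iPhone" alongside the subwords "i" and "Phone".
                    "preserve_original": True,
                }
            },
            "analyzer": {
                "name_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    # Lowercase after splitting, so case transitions are
                    # still visible to the delimiter filter.
                    "filter": ["name_delimiter", "lowercase"],
                }
            },
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "name": {"type": "string", "analyzer": "name_analyzer"}
            }
        }
    },
}

def create_index(client, index_name):
    """Create index_name with the analyzer above.

    client is expected to be an elasticsearch.Elasticsearch instance.
    """
    return client.indices.create(index=index_name, body=INDEX_BODY)
```

With a connected client, create_index(client, 'mobiles') would replace the plain client.indices.create call from the question. The index has to be created with these settings before documents are added, since the analyzer is applied at index time.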
Regarding elasticsearch - how to match non-matching words such as "i phone" in Elasticsearch, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33516502/