I am trying to update the settings of an existing index.
My initial settings look like this:
client.create(index = "movies", body= {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"filter": {
"my_custom_stop_words": {
"type": "stop",
"stopwords": stop_words
}
},
"analyzer": {
"my_custom_analyzer": {
"filter": [
"lowercase",
"my_custom_stop_words"
],
"type": "custom",
"tokenizer": "standard"
}
}
}
},
"mappings": {
"properties": {
"body": {
"type": "text",
"analyzer": "my_custom_analyzer",
"search_analyzer": "my_custom_analyzer",
"search_quote_analyzer": "my_custom_analyzer"
}
}
}
},
ignore=400
)
I am trying to add a synonym filter to the existing analyzer (my_custom_analyzer) using client.put_settings:
client.put_settings(index='movies', body={
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"filter": [
"lowercase",
"my_stops",
"my_synonyms"
],
"type": "custom",
"tokenizer": "standard"
}
},
"filter": {
"my_custom_stops": {
"type": "stop",
"stopwords": stop_words
},
"my_custom_synonyms": {
"ignore_case": "true",
"type": "synonym",
"synonyms": ["Harry Potter, HP => HP", "Terminator, TM => TM"]
}
}
}
},
"mappings": {
"properties": {
"body": {
"type": "text",
"analyzer": "my_custom_analyzer",
"search_analyzer": "my_custom_analyzer",
"search_quote_analyzer": "my_custom_analyzer"
}
}
}
},
ignore=400
)
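However, when I run a search query (searching for "HP") against the movies index, I want the documents ranked so that the document containing "Harry Potter" 5 times is at the top of the list. Right now the document containing "HP" 3 times comes first, so the synonym filter does not seem to be working. I closed the movies index before calling client.put_settings and reopened it afterwards.
Any help would be greatly appreciated!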
Best Answer
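You should reindex all of your data so that the updated settings are applied to every document and field.
Data that has already been indexed is not affected by the updated analyzer; only documents indexed after the settings change are.
If you do not reindex, you can get incorrect results, because the old data was analyzed with the old custom analyzer rather than the new one.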
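To make the mismatch concrete, here is a toy model in plain Python. It is not Elasticsearch code: the helper names (analyze_old, analyze_new, term_frequency) and the whitespace tokenization are assumptions made for illustration, and stop words are left out. It only shows why a query for "HP" cannot match "Harry Potter" in documents whose tokens were produced before the synonym rule existed.

```python
def analyze_old(text):
    """Toy stand-in for the old analyzer: lowercase and split, no synonyms."""
    return text.lower().split()


def analyze_new(text):
    """Toy stand-in for the new analyzer: additionally map 'harry potter' to 'hp'."""
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        if tokens[i:i + 2] == ["harry", "potter"]:
            out.append("hp")  # synonym rule: Harry Potter, HP => HP
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def term_frequency(indexed_tokens, query, search_analyzer):
    """Count how often the analyzed query terms occur among the stored tokens."""
    return sum(indexed_tokens.count(term) for term in search_analyzer(query))


doc_a = "Harry Potter " * 5  # mentions "Harry Potter" five times
doc_b = "HP " * 3            # mentions "HP" three times

# Documents indexed BEFORE the settings change keep their old tokens.
stale = {
    "doc_a": term_frequency(analyze_old(doc_a), "HP", analyze_new),
    "doc_b": term_frequency(analyze_old(doc_b), "HP", analyze_new),
}

# After reindexing, the stored tokens come from the new analyzer.
fresh = {
    "doc_a": term_frequency(analyze_new(doc_a), "HP", analyze_new),
    "doc_b": term_frequency(analyze_new(doc_b), "HP", analyze_new),
}

print(stale)  # {'doc_a': 0, 'doc_b': 3}
print(fresh)  # {'doc_a': 5, 'doc_b': 3}
```

In the stale case doc_a never matches, which lines up with the ranking described in the question; once the tokens are regenerated, doc_a has the higher term count.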
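The most effective way to do this is to create a new index with the updated settings and move the data from the old index into it.
Reindex API
Follow these steps: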
POST _reindex
{
"source": {
"index": "movies"
},
"dest": {
"index": "new_movies"
}
}
DELETE movies
PUT movies
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"filter": [
"lowercase",
"my_custom_stops",
"my_custom_synonyms"
],
"type": "custom",
"tokenizer": "standard"
}
},
"filter": {
"my_custom_stops": {
"type": "stop",
"stopwords": "stop_words"
},
"my_custom_synonyms": {
"ignore_case": "true",
"type": "synonym",
"synonyms": [
"Harry Potter, HP => HP",
"Terminator, TM => TM"
]
}
}
}
},
"mappings": {
"properties": {
"body": {
"type": "text",
"analyzer": "my_custom_analyzer",
"search_analyzer": "my_custom_analyzer",
"search_quote_analyzer": "my_custom_analyzer"
}
}
}
}
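Before copying the data back, it may be worth confirming that the recreated index analyzes text as intended. The following is a minimal sketch, assuming an elasticsearch-py 7.x-style client object passed in as es; the helper name analyzed_tokens is hypothetical and not part of the original answer. It calls the analyze endpoint of the indices client and returns the token strings.

```python
def analyzed_tokens(es, index, analyzer, text):
    """Return the tokens that `analyzer` on `index` produces for `text`.

    `es` is assumed to be an elasticsearch-py 7.x-style client; no
    `ignore=400` is passed, so a bad request raises instead of being hidden.
    """
    response = es.indices.analyze(
        index=index,
        body={"analyzer": analyzer, "text": text},
    )
    return [token["token"] for token in response["tokens"]]


# Usage (assumes a reachable cluster and the `elasticsearch` package):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   print(analyzed_tokens(es, "movies", "my_custom_analyzer", "Harry Potter"))
# If the synonym filter is active, the two words should collapse into a
# single hp token rather than coming back as harry and potter.
```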
POST _reindex?wait_for_completion=false
{
"source": {
"index": "new_movies"
},
"dest": {
"index": "movies"
}
}
Once you have verified that all the data is in place, you can delete the new_movies index:
DELETE new_movies
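Since the question uses the Python client, the same round trip can be sketched in Python. This is a hedged outline rather than a definitive implementation: it assumes an elasticsearch-py 7.x-style client passed in as es, the function name rebuild_index is hypothetical, and new_body stands for the settings and mappings dictionary shown in the PUT movies request. The document-count checks are an addition for safety and are not part of the original steps.

```python
def rebuild_index(es, index, temp_index, new_body):
    """Recreate `index` with `new_body`, parking its documents in `temp_index`.

    `es` is assumed to be an elasticsearch-py 7.x-style client. Errors are
    not suppressed with `ignore=400`, so a rejected request raises.
    """
    # 1. Copy every document into the temporary index and refresh it.
    es.reindex(
        body={"source": {"index": index}, "dest": {"index": temp_index}},
        wait_for_completion=True,
        refresh=True,
    )
    expected = es.count(index=index)["count"]
    copied = es.count(index=temp_index)["count"]
    if copied != expected:
        raise RuntimeError(f"copied {copied} of {expected} documents; aborting")

    # 2. Drop the original index and recreate it with the new analysis settings.
    es.indices.delete(index=index)
    es.indices.create(index=index, body=new_body)

    # 3. Copy the documents back; they are analyzed again on the way in.
    es.reindex(
        body={"source": {"index": temp_index}, "dest": {"index": index}},
        wait_for_completion=True,
        refresh=True,
    )

    # 4. Remove the temporary index only after the counts match.
    restored = es.count(index=index)["count"]
    if restored != expected:
        raise RuntimeError(f"restored {restored} of {expected} documents")
    es.indices.delete(index=temp_index)
    return restored
```

A call might look like rebuild_index(es, "movies", "new_movies", new_body). Waiting for each reindex to complete keeps the count checks meaningful; for large indices a background task, as in the wait_for_completion=false request above, may be preferable.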
Hope this helps.
Regarding python - Elasticsearch IndicesClient.put_settings not working, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58896418/