This part of the Elasticsearch documentation shows that the Swedish analyzer can be reimplemented as a custom analyzer, as follows:
PUT /swedish_example
{
  "settings": {
    "analysis": {
      "filter": {
        "swedish_stop": {
          "type": "stop",
          "stopwords": "_swedish_"
        },
        "swedish_keywords": {
          "type": "keyword_marker",
          "keywords": ["exempel"]
        },
        "swedish_stemmer": {
          "type": "stemmer",
          "language": "swedish"
        }
      },
      "analyzer": {
        "swedish": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "swedish_stop",
            "swedish_keywords",
            "swedish_stemmer"
          ]
        }
      }
    }
  }
}
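My question is: how does this analyzer recognize keywords? Sure, I can define keywords in the
settings.analysis.filter.swedish_keywords.keywords
field, but what if I am too lazy to do that? Does Elasticsearch consult some other predefined list of Swedish keywords? In the example above, the settings do not seem to provide such a list. In other words, do I have to define the keywords myself, or does Elasticsearch look at some other list by default to find keywords?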
Best answer
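Yes, you need to specify this list yourself. Otherwise, this filter will not do anything. Unlike the stop filter, where _swedish_ refers to a predefined stop word list, the keyword marker filter has no built-in Swedish keyword list.
According to the Elasticsearch documentation:
Keyword Marker Token Filter
Protects words from being modified by stemmers. Must be placed before any stemming filters.
Alternatively, you can specify:
keywords_path: A path (either relative to config location, or absolute) to a list of words.
keywords_pattern: A regular expression pattern to match against words in the text.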
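As a rough sketch of those two alternatives, the request below defines one keyword_marker filter per option. The index name, the filter names, the file path analysis/swedish_keywords.txt, and the pattern are all hypothetical and only illustrate the shape of the settings. The file would have to exist on every node (one word per line), otherwise index creation fails. The two options are kept in separate filters here, since as far as I know keywords_pattern cannot be combined with keywords or keywords_path in the same filter.

```
PUT /swedish_keywords_example
{
  "settings": {
    "analysis": {
      "filter": {
        "swedish_keywords_from_file": {
          "type": "keyword_marker",
          "keywords_path": "analysis/swedish_keywords.txt"
        },
        "swedish_keywords_from_pattern": {
          "type": "keyword_marker",
          "keywords_pattern": "exempel.*"
        }
      }
    }
  }
}
```

Either filter would then take the place of swedish_keywords in the analyzer's filter list, still placed before the stemmer.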
More information about this filter: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keyword-marker-tokenfilter.html
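One way to check what the filter actually protects is to run some text through the analyzer with the _analyze API. This is only a sketch: it assumes the swedish_example index from the question has been created, and the sample text is arbitrary.

```
GET /swedish_example/_analyze
{
  "analyzer": "swedish",
  "text": "exempel exemplen"
}
```

Tokens that match an entry in the keywords list should come back unstemmed, while all other tokens pass through the Swedish stemmer as usual. With an empty or missing list, every token is stemmed.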
Regarding "elasticsearch - Which keywords does the Swedish analyzer use?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50350606/