scala - elastic4s: how to add an analyzer/filter for german_phonebook to the analysis settings?

Tags: scala elasticsearch elastic4s

How do I add the following german_phonebook analyzer to Elasticsearch using elastic4s?

        "index": {
            "analysis": {
                "analyzer": {
                    "german": {
                        "filter": [
                            "lowercase",
                            "german_stop",
                            "german_normalization",
                            "german_stemmer"
                        ],
                        "tokenizer": "standard"
                    },
                    "german_phonebook": {
                        "filter": [
                            "german_phonebook"
                        ],
                        "tokenizer": "keyword"
                    },
                    "mySynonyms": {
                        "filter": [
                            "lowercase",
                            "mySynonymFilter"
                        ],
                        "tokenizer": "standard"
                    }
                },
                "filter": {
                    "german_phonebook": {
                        "country": "CH",
                        "language": "de",
                        "type": "icu_collation",
                        "variant": "@collation=phonebook"
                    },
                    "german_stemmer": {
                        "language": "light_german",
                        "type": "stemmer"
                    },
                    "german_stop": {
                        "stopwords": "_german",
                        "type": "stop"
                    },
                    "mySynonymFilter": {
                        "synonyms": [
                            "swisslift,lift"
                        ],
                        "type": "synonym"
                    }
                }
            },

The core question is: which elastic4s filter should I use for the german_phonebook filter, which is of type icu_collation?

...

After the answer, I came up with the following code:
  case class GPhonebook() extends TokenFilterDefinition {
    val filterType = "phonebook"
    def name = "german_phonebook"
    override def build(source: XContentBuilder): Unit = {
      source.field("tokenizer", "keyword")
      source.field("country", "CH")
      source.field("language", "de")
      source.field("type", "icu_collation")
      source.field("variant", "@collation=phonebook")  
    }
  }

The analyzer definition now looks like this:
  CustomAnalyzerDefinition(
      "german_phonebook",
      KeywordTokenizer("myKeywordTokenizer2"),
      GPhonebook()
  )

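Best answer

What you really want is to be able to say
CustomTokenFilter("german_phonebook") or BuiltInTokenFilter("german_phonebook"), but you can't (I will add that).

So, for now, you need to extend TokenFilterDefinition.

For example, something like: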

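// Package paths as in elastic4s 2.x; adjust for your version.
import com.sksamuel.elastic4s.analyzers.TokenFilterDefinition
import org.elasticsearch.common.xcontent.XContentBuilder

case class GPhonebook() extends TokenFilterDefinition {
  val filterType = "phonebook"
  def name = "german_phonebook" // the key the analyzer's "filter" list refers to
  override def build(source: XContentBuilder): Unit = {
    // set extra params in here
  }
}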
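
Filled in with the parameters from the settings in the question, the filter might look like the sketch below. It rests on a few assumptions: the class name GermanPhonebookFilter is made up for illustration, the package paths are those of elastic4s 2.x, and elastic4s is assumed to emit filterType as the filter's "type" field, which is why it is set to "icu_collation" and "type" is not written again in build. The "tokenizer" setting is left out because it belongs to the analyzer, not to the filter. The icu_collation filter type itself comes from the Elasticsearch ICU analysis plugin, so that plugin has to be installed on the cluster.

```scala
import com.sksamuel.elastic4s.analyzers.TokenFilterDefinition
import org.elasticsearch.common.xcontent.XContentBuilder

// Hypothetical class name; the default `name` is the key that the
// analyzer's "filter" list refers to in the index settings.
case class GermanPhonebookFilter(name: String = "german_phonebook")
    extends TokenFilterDefinition {

  // Assumption: elastic4s serializes this value as the "type" field.
  val filterType = "icu_collation"

  // Remaining parameters, copied from the settings in the question.
  override def build(source: XContentBuilder): Unit = {
    source.field("language", "de")
    source.field("country", "CH")
    source.field("variant", "@collation=phonebook")
  }
}
```

An instance can then be passed to CustomAnalyzerDefinition in place of GPhonebook(), together with a keyword tokenizer, as in the analyzer definition shown in the question.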

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/35580412/
