elasticsearch - Elasticsearch 6.4 throws an error when creating a custom char filter

Tags: elasticsearch search elasticsearch-analyzers

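I'm fairly sure I'm missing something in the syntax, but I can't figure out exactly what. I'm trying to create a phone number pattern capture token filter as defined here. It says to define a keyword filter first and then apply the pattern capture token filter on top of it. This is what I did: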

{
    "mappings": {
        "_doc": {
            "properties": {
                "phone": {
                    "type": "text",
                    "analyzer": "my_phone_analyzer"
                }
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "my_phone_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "char_filter": [
                        "phone_number"
                    ]
                }
            }
        },
        "char_filter": {
            "phone_number": {
                "type": "pattern_capture",
                "preserve_original": 1,
                "patterns": [
                    "1(\\d{3}(\\d+))"
                ]
            }
        }
    }
}

This results in the following error:
{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "unknown setting [index.char_filter.phone_number.patterns] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "unknown setting [index.char_filter.phone_number.patterns] please check that any required plugins are installed, or check the breaking changes documentation for removed settings",
        "suppressed": [
            {
                "type": "illegal_argument_exception",
                "reason": "unknown setting [index.char_filter.phone_number.preserve_original] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
            },
            {
                "type": "illegal_argument_exception",
                "reason": "unknown setting [index.char_filter.phone_number.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
            }
        ]
    },
    "status": 400
}

If someone could point out what I'm doing wrong, that would be great!

Best answer

The link you mentioned looks quite old.
pattern_capture is no longer available as a char_filter; it is only available as a token filter.

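If you are using an Elasticsearch version newer than 5.x, here is the mapping to use. Note that pattern_capture is defined under "filter" inside "analysis", and the analyzer references it through "filter" rather than "char_filter":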

PUT <your_index_name>
{  
   "mappings":{  
      "_doc":{  
         "properties":{  
            "phone":{  
               "type":"text",
               "analyzer":"my_phone_analyzer"
            }
         }
      }
   },
   "settings":{  
      "analysis":{  
         "analyzer":{  
            "my_phone_analyzer":{  
               "type":"custom",
               "tokenizer":"keyword",
               "filter":[  
                  "phone_number"
               ]
            }
         },
         "filter":{  
            "phone_number":{  
               "type":"pattern_capture",
               "preserve_original":true,
               "patterns":[  
                  "1(\\d{3}(\\d+))"
               ]
            }
         }
      }
   }
}
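
The setting names in the error above start with index.char_filter rather than index.analysis.char_filter, which suggests the original request placed "char_filter" next to "analysis" instead of inside it. As a rough illustration, here is a small Python check that can be run on a request body before sending it. The helper name missing_definitions is hypothetical and not part of any Elasticsearch client, and the check only makes sense for custom component names, since built-in filters are referenced without a definition.

```python
def missing_definitions(index_body):
    """Return (analyzer, section, name) for every char_filter/filter that an
    analyzer references but that is not defined under settings.analysis.

    Hypothetical helper for illustration only. Built-in filters such as
    "lowercase" would also be reported, because they need no definition.
    """
    analysis = index_body.get("settings", {}).get("analysis", {})
    missing = []
    for analyzer_name, analyzer in analysis.get("analyzer", {}).items():
        for section in ("char_filter", "filter"):
            defined = analysis.get(section, {})
            for name in analyzer.get(section, []):
                if name not in defined:
                    missing.append((analyzer_name, section, name))
    return missing


phone_filter = {
    "type": "pattern_capture",
    "preserve_original": True,
    "patterns": ["1(\\d{3}(\\d+))"],
}

# Shape of the request from the question: "char_filter" sits beside "analysis".
broken = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_phone_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "char_filter": ["phone_number"],
                }
            }
        },
        "char_filter": {"phone_number": phone_filter},
    }
}

# Shape of the request from the answer: "filter" sits inside "analysis".
fixed = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_phone_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["phone_number"],
                }
            },
            "filter": {"phone_number": phone_filter},
        }
    }
}

print(missing_definitions(broken))  # [('my_phone_analyzer', 'char_filter', 'phone_number')]
print(missing_definitions(fixed))   # []
```

This only checks nesting. It does not know that pattern_capture is a token filter type, so moving the block inside "analysis" while keeping it under "char_filter" would pass this check and still be rejected by Elasticsearch.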

You can use the Analyze API to see which tokens are generated, as shown below:
POST <your_index_name>/_analyze
{
  "analyzer": "my_phone_analyzer",
  "text": "19195557321"
}

Tokens:
{
  "tokens" : [
    {
      "token" : "19195557321",
      "start_offset" : 0,
      "end_offset" : 11,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "9195557321",
      "start_offset" : 0,
      "end_offset" : 11,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "5557321",
      "start_offset" : 0,
      "end_offset" : 11,
      "type" : "word",
      "position" : 0
    }
  ]
}
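
To see why these three tokens come out, it can help to run the same regular expression outside Elasticsearch. The following is a minimal Python sketch that approximates the filter's behaviour with the standard re module: it keeps the original token when preserve_original is set and then emits each capture group. The function name pattern_capture is made up for this example, and skipping empty or repeated groups is a simplifying assumption rather than a statement about how the filter is implemented.

```python
import re


def pattern_capture(token, patterns, preserve_original=True):
    """Approximate a pattern_capture token filter on one keyword token.

    Illustrative only: emits the original token (if requested) followed by
    every non-empty capture group, skipping values already emitted.
    """
    tokens = [token] if preserve_original else []
    for pattern in patterns:
        for match in re.finditer(pattern, token):
            for group in match.groups():
                if group and group not in tokens:
                    tokens.append(group)
    return tokens


# Same input and pattern as the _analyze request above.
tokens = pattern_capture("19195557321", [r"1(\d{3}(\d+))"])
print(tokens)  # ['19195557321', '9195557321', '5557321']
```

The outer group captures everything after the leading 1 (area code plus local number), and the inner group captures the digits after the three-digit area code, so a search for either shorter form can match the indexed number.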

Hope that helps!

Regarding "elasticsearch - Elasticsearch 6.4 throws an error when creating a custom char filter", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56493191/
