elasticsearch - Elasticsearch custom analyzer that splits on a specific character

Tags: elasticsearch

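How can I create a custom analyzer that tokenizes a field only on the '/' character?

My field contains URL strings, for example: "https://stackoverflow.com/questions/ask"
I want it tokenized like this: "http", "stackoverflow.com", "questions" and "ask"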

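Best answer

A pattern analyzer, which splits the text on a regular expression, seems to do what you want. The pattern below splits on runs of '/' and ':' so that the colon after the scheme is not kept in the first token: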

PUT /test_index
{
   "settings": {
      "number_of_shards": 1,
      "analysis": {
         "analyzer": {
            "slash_analyzer": {
               "type": "pattern",
               "pattern": "[/:]+",
               "lowercase": true
            }
         }
      }
   },
   "mappings": {
      "doc": {
         "properties": {
            "url": {
               "type": "string",
               "index_analyzer": "slash_analyzer",
               "search_analyzer": "standard",
               "term_vector": "yes"
            }
         }
      }
   }
}

PUT /test_index/doc/1
{
   "url": "http://stackoverflow.com/questions/ask"
}

I added term vectors to the mapping (you probably don't want to do this in production) so that we can see which terms were generated:

GET /test_index/doc/1/_termvector
...
{
   "_index": "test_index",
   "_type": "doc",
   "_id": "1",
   "_version": 1,
   "found": true,
   "took": 1,
   "term_vectors": {
      "url": {
         "field_statistics": {
            "sum_doc_freq": 4,
            "doc_count": 1,
            "sum_ttf": 4
         },
         "terms": {
            "ask": {
               "term_freq": 1
            },
            "http": {
               "term_freq": 1
            },
            "questions": {
               "term_freq": 1
            },
            "stackoverflow.com": {
               "term_freq": 1
            }
         }
      }
   }
}

Here is the code I used:

http://sense.qbox.io/gist/669fbdd681895d7e9f8db13799865c6e8be75b11
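
To check a pattern before putting it into the index settings, the splitting can be approximated outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch's own code: it assumes the analyzer amounts to lowercasing the input, splitting on the regular expression, and discarding empty pieces. The function name slash_analyze is hypothetical.

```python
import re


def slash_analyze(text, pattern=r"[/:]+", lowercase=True):
    """Approximate the analyzer above: lowercase, split on the regex, drop empty pieces."""
    if lowercase:
        text = text.lower()
    # re.split leaves empty strings for leading or trailing separators; filter them out.
    return [token for token in re.split(pattern, text) if token]


if __name__ == "__main__":
    print(slash_analyze("http://stackoverflow.com/questions/ask"))
    # Splitting on "/" alone leaves the colon attached to the scheme.
    print(slash_analyze("http://stackoverflow.com/questions/ask", pattern=r"/"))
```

In this sketch the first call yields the same four terms as the term vector response above, while the second call shows why ':' is part of the pattern: with "/" alone, the first token comes out as "http:".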

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/32951811/
