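elasticsearch - Elasticsearch query does not work with @ values

Tags: elasticsearch

When I run a simple search query on an email address, it returns nothing unless I remove the "@" and everything after it. Why?

I want to query email addresses in a fuzzy, autocomplete-style way.

Elasticsearch info: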

{
  "name" : "ZZZ",
  "cluster_name" : "YYY",
  "cluster_uuid" : "XXX",
  "version" : {
    "number" : "6.5.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "WWW",
    "build_date" : "2018-11-29T23:58:20.891072Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Mapping:
PUT users
{
  "mappings":
  {
    "_doc": { "properties": { "mail": { "type": "text" } } }
  }
}

All data:
[
    { "mail": "firstname.lastname@company.com" },
    { "mail": "john.doe@company.com" }
]

Query that works:

The term query matches, even though the stored value is mail == "firstname.lastname@company.com" and not "firstname.lastname" ...
QUERY :
GET users/_search
{ "query": { "term": { "mail": "firstname.lastname" } }}

RETURN :
{
  "took": 7,
  "timed_out": false,
  "_shards": { "total": 6, "successful": 6, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 4.336203,
    "hits": [
      {
        "_index": "users",
        "_type": "_doc",
        "_id": "H1dQ4WgBypYasGfnnXXI",
        "_score": 4.336203,
        "_source": {
          "mail": "firstname.lastname@company.com"
        }
      }
    ]
  }
}

Query that does not work:
QUERY :
GET users/_search
{ "query": { "term": { "mail": "firstname.lastname@company.com" } }}

RETURN :
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 6, "successful": 6, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

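Solution:

Change the mapping of the mail field to use a custom analyzer built on the uax_url_email tokenizer (reindex after changing the mapping).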
PUT users
{
  "settings":
  {
    "index": { "analysis": { "analyzer": { "mail": { "tokenizer":"uax_url_email" } } } }
  },
  "mappings":
  {
    "_doc": { "properties": { "mail": { "type": "text", "analyzer":"mail" } } }
  }
}
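
One way to check that the custom analyzer behaves as intended is to run the address through the _analyze API using the analyzer name defined in the settings above (mail). This is only a sketch, and it assumes the users index has been recreated with those settings. The expectation is a single token covering the whole address instead of two separate tokens.

```
GET users/_analyze
{
  "analyzer": "mail",
  "text": "firstname.lastname@company.com"
}
```

Note that the uax_url_email tokenizer on its own does not lowercase tokens, so with this analyzer a term query still has to match the indexed case exactly.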

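Best answer

If you do not configure another tokenizer for an indexed text field, it uses the standard tokenizer, which splits on the @ symbol [I don't have a source, but the evidence is below].

If you use a term query instead of a match query, that exact term is looked up in the inverted index without being analyzed first (see "elasticsearch match vs term query").

Your inverted index looks like this: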

GET users/_analyze
{
  "text": "firstname.lastname@company.com"
}

{
  "tokens": [
    {
      "token": "firstname.lastname",
      "start_offset": 0,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "company.com",
      "start_offset": 19,
      "end_offset": 30,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
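
To illustrate how a term query looks up exact tokens, the request below searches for the second token from the output above. Assuming the two sample documents from the question and the original mapping (default standard analyzer), it should match both users, because each address produces a company.com token.

```
GET users/_search
{
  "query": { "term": { "mail": "company.com" } }
}
```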

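To fix this, you can either specify your own analyzer for the mail field or use a match query, which analyzes the search text the same way the indexed text was analyzed.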
GET users/_search
{
  "query": {
    "match": {
      "mail": "firstname.lastname@company.com"
    }
  }
}
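
Because the standard analyzer splits the address into two tokens and a match query combines its terms with OR by default, the query above is likely to return john.doe@company.com as well, since it shares the company.com token. A hedged variant, assuming the original mapping is kept, requires every token to match by setting operator to and:

```
GET users/_search
{
  "query": {
    "match": {
      "mail": {
        "query": "firstname.lastname@company.com",
        "operator": "and"
      }
    }
  }
}
```

This still matches on individual tokens and not on the address as a whole, so a custom analyzer on the mail field remains the stricter option.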

Source: a similar question on Stack Overflow, "Elasticsearch query does not work with @ values": https://stackoverflow.com/questions/54792132/