elasticsearch - Why does a query for "apache" not match the following document in Elasticsearch?

Tags: elasticsearch

I have a simple text document, which I retrieve with the following command: curl -X GET "localhost:9200/customer/_doc/1"

{"_index":"customer","_type":"_doc","_id":"1","_version":1,"found":true,"_source":
{
  "description": "Sun Java Plug-In 1.4 through 1.4.2_02 allows remote attackers to repeatedly access the floppy drive via the createXmlDocument method in the org.apache.crimson.tree.XmlDocument class, which violates the Java security model."
}
}

When I run the query below against this document, Elasticsearch returns no hits, and I would like to know why.
{
    "query": {
        "match" : {
            "description": "apache"
        }
    }
}

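If I replace apache with createXmlDocument or org.apache.crimson.tree.XmlDocument, the query succeeds. My initial understanding was that org.apache.crimson.tree.XmlDocument would be split into five words (org, apache, crimson, tree and XmlDocument), but now I suspect that the whole string org.apache.crimson.tree.XmlDocument is stored as is by Elasticsearch. If that is the case, why, and how can I get the desired result?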

Best answer

If you do not define an analyzer, the standard analyzer is used.

For the class name in your text, the standard analyzer creates the following token:

{
  "token" : "org.apache.crimson.tree.xmldocument",
  "start_offset" : 140,
  "end_offset" : 175,
  "type" : "<ALPHANUM>",
  "position" : 22
}

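That is why your search finds nothing. The standard tokenizer does not split on a period that sits between two letters, so the whole class name is indexed as a single lowercased token and a match query for apache has nothing to match. If you use the pattern analyzer instead, the token apache is created. Its default pattern \W+ (any run of non-word characters) works for your case.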
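The effect of the default pattern can be approximated outside Elasticsearch. The following is a minimal Python sketch, not the analyzer's actual implementation: it assumes the pattern analyzer's defaults (split on \W+, then lowercase, no stop words), and the names TEXT and pattern_tokens are made up for this illustration.

```python
import re

# The document text from the question.
TEXT = (
    "Sun Java Plug-In 1.4 through 1.4.2_02 allows remote attackers to "
    "repeatedly access the floppy drive via the createXmlDocument method in "
    "the org.apache.crimson.tree.XmlDocument class, which violates the Java "
    "security model."
)


def pattern_tokens(text):
    """Approximate the pattern analyzer: split on \\W+, then lowercase."""
    # re.ASCII makes \W mean [^a-zA-Z0-9_], like Java's default regex mode.
    parts = re.split(r"\W+", text, flags=re.ASCII)
    # Drop the empty string produced by the trailing period.
    return [p.lower() for p in parts if p]


tokens = pattern_tokens(TEXT)

# The class name is broken up at each period, so "apache" is its own token.
print("apache" in tokens)
# The combined token that the standard analyzer produced is no longer present.
print("org.apache.crimson.tree.xmldocument" in tokens)
```

Note the trade-off this makes visible: with the pattern analyzer, a match query for apache has a token to hit, but version strings such as 1.4 are also split at the period.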

You can test the pattern analyzer with the _analyze API:
curl -XGET "http://localhost:9200/_analyze" -H 'Content-Type: application/json' -d'
{
  "text": "Sun Java Plug-In 1.4 through 1.4.2_02 allows remote attackers to repeatedly access the floppy drive via the createXmlDocument method in the org.apache.crimson.tree.XmlDocument class, which violates the Java security model.",
  "analyzer": "pattern"
}'
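To compare analyzers side by side, the same _analyze call can be scripted. This is a rough sketch using only the Python standard library. The helper names (build_analyze_request, token_texts, analyze) are hypothetical, and it assumes an unsecured node at http://localhost:9200, as in the curl command above.

```python
import json
import urllib.request


def build_analyze_request(text, analyzer, base_url="http://localhost:9200"):
    """Build the HTTP request for the _analyze API (base_url is an assumption)."""
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # _analyze accepts POST as well as GET with a body
    )


def token_texts(response):
    """Extract the token strings from a parsed _analyze response."""
    return [t["token"] for t in response["tokens"]]


def analyze(text, analyzer):
    """Send the request to the node and return the list of token strings."""
    with urllib.request.urlopen(build_analyze_request(text, analyzer)) as resp:
        return token_texts(json.load(resp))
```

With a node running, calling analyze(text, "standard") and analyze(text, "pattern") on the description text shows which of the two produces a standalone apache token.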

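Define an explicit mapping for your index, as shown below. Because the analyzer of an existing field cannot be changed, the index has to be recreated with this mapping and the document indexed again.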
PUT customer
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_doc": {
      "properties": {
        "description": {
          "type": "text",
          "analyzer": "pattern"
        }
      }
    }
  }
}
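The _doc level under mappings matches the Elasticsearch 6.x output shown in this thread. From 7.x on, mapping types are removed and properties sits directly under mappings. As a small sketch, the request body for either layout could be built like this (the function name customer_index_body is made up for the example):

```python
import json


def customer_index_body(mapping_type=None):
    """Build the create-index body; pass "_doc" for the typed 6.x layout."""
    properties = {"description": {"type": "text", "analyzer": "pattern"}}
    mappings = {"properties": properties}
    if mapping_type is not None:
        # 6.x style: the properties are nested under the mapping type name.
        mappings = {mapping_type: mappings}
    return {
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": mappings,
    }


# Typeless layout for 7.x and later.
print(json.dumps(customer_index_body(), indent=2))
```

Calling customer_index_body("_doc") reproduces the body shown above.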

If you run the query again, you get, for example:
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "customer",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "description" : "Sun Java Plug-In 1.4 through 1.4.2_02 allows remote attackers to repeatedly access the floppy drive via the createXmlDocument method in the org.apache.crimson.tree.XmlDocument class, which violates the Java security model."
        }
      }
    ]
  }

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/54296864/
