java - Elasticsearch : Retrieve long text field from a document

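I have a document indexed in ES. The document has 3 text fields: F1, F2 and F3.

When I search for this document with the Java API, I only get values for fields F1 and F2; field F3 comes back empty.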

QueryBuilder query =  //Some query

SearchResponse response = client.prepareSearch(index)
                .addDocValueField("F1.keyword")
                .addDocValueField("F2.keyword")
                .addDocValueField("F3.keyword")
                .setQuery(query)
                .execute()
                .actionGet();

SearchHit hit = response.getHits().getAt(0);

System.out.println("F1 : "+hit.getField("F1.keyword").getValue());
System.out.println("F2 : "+hit.getField("F2.keyword").getValue());
System.out.println("F3 : "+hit.getField("F3.keyword").getValue()); // empty

My field F3 can be very long. In the document I use for testing it holds more than 300 characters, and it could be longer.

My index mapping is:

"mappings": {
      "MyIndex": {
        "properties": {
          "F1": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "F2": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "F3": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }

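So I raised ignore_above in the F3 mapping to 20000 (possibly a bad idea?), but I still see the same behavior.

What is the problem, and what is the correct way to do this?

Notes:

Using ES 5.6.3
I don't need any analysis or term search on field F3; I only need to retrieve its value when the query matches F1 or F2.
I will only have a small number of such documents, so efficiency is not a major concern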

Edit:

Strangely, when I send the query to Elasticsearch from my browser, I get the expected result:

http://localhost:9200/MyIndex/_search?pretty=true?{"query": {"match_all": {}}}

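Best answer

By default, Elasticsearch maps a string to two different field types: text and keyword. They are different things that serve different purposes: text is a full-text search field, while keyword behaves like a structured constant value. Read more in the docs.

In your case, the keyword sub-field that is included by default does not look helpful. In your query you should simply fetch the "regular" F3 field, and/or the regular F1 and F2 fields.

Finally, I'm not very familiar with the ES Java client, but if you want source filtering (that is, getting only a subset of the values back from the request), I don't think addDocValueField() is the right call. See: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/5.6/java-rest-high-search.html#_source_filtering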
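
To make the reasoning above concrete, here is a minimal stdlib-only sketch of why F3.keyword has no doc value while the original text is still available from _source. It is a toy model, not the Elasticsearch implementation: the class IgnoreAboveSketch and its helpers keywordDocValue, sourceValue and repeat are hypothetical names, and length is measured with String.length() as a simplification. With the real 5.x client, the map would presumably come from the hit's source (for example via setFetchSource(...) on the search request and getSourceAsMap() on the hit) rather than being built by hand. It may also be relevant that raising ignore_above likely only affects documents indexed after the change, so an existing document would need to be re-indexed.

```java
import java.util.HashMap;
import java.util.Map;

class IgnoreAboveSketch {

    // Toy model of a keyword sub-field: a value longer than ignoreAbove is
    // skipped at index time, so no doc value exists for it (null here).
    static String keywordDocValue(String value, int ignoreAbove) {
        return value.length() > ignoreAbove ? null : value;
    }

    // Reads a field from a _source-style map; returns null when it is absent.
    static String sourceValue(Map<String, Object> source, String field) {
        Object value = source.get(field);
        return value == null ? null : value.toString();
    }

    // Builds a string of n copies of c (test data only).
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the _source of one search hit (assumption).
        Map<String, Object> source = new HashMap<>();
        source.put("F1", "short title");
        source.put("F3", repeat('x', 300));

        for (String field : new String[] {"F1", "F3"}) {
            String original = sourceValue(source, field);
            String docValue = keywordDocValue(original, 256);
            System.out.println(field + ".keyword doc value present: " + (docValue != null));
            System.out.println(field + " length read from _source: " + original.length());
        }
    }
}
```

In this model the 300-character F3 value is dropped from the keyword sub-field but is still fully readable from the source map, which matches the browser request returning the expected result.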

This question, "java - Elasticsearch : Retrieve long text field from a document", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/57725906/
