elasticsearch - What are the performance drawbacks of flat vs. nested documents?

I have data that naturally fits a document model, for example:

{
  "name": "Multi G. Enre",
  "books": [
    {
      "name": "Guns and lasers",
      "genre": "scifi",
      "publisher": "orbit"
    },
    {
      "name": "Dead in the night",
      "genre": "thriller",
      "publisher": "penguin"
    }
  ]
}

(The example is taken from a good review of nested and has_child documents.)

To analyze them in Kibana and other software (a mix of legacy and laziness), the documents are flattened into one record per book:

{
  "name": "Multi G. Enre",
  "book_name": "Guns and lasers",
  "book_genre": "scifi",
  "book_publisher": "orbit"
}
{
  "name": "Multi G. Enre",
  "book_name": "Dead in the night",
  "book_genre": "thriller",
  "book_publisher": "penguin"
}
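The flattening step described above can be sketched in a few lines of Python. This is only an illustration of the transformation, not the asker's actual pipeline; the function name `flatten_author` is hypothetical, and the field names are taken from the two samples.

```python
def flatten_author(author):
    """Return one flat record per book, repeating the author's name on each."""
    return [
        {
            "name": author["name"],
            "book_name": book["name"],
            "book_genre": book["genre"],
            "book_publisher": book["publisher"],
        }
        for book in author.get("books", [])
    ]


nested_doc = {
    "name": "Multi G. Enre",
    "books": [
        {"name": "Guns and lasers", "genre": "scifi", "publisher": "orbit"},
        {"name": "Dead in the night", "genre": "thriller", "publisher": "penguin"},
    ],
}

# One nested author document becomes two flat book records.
flat_docs = flatten_author(nested_doc)
```

Every author-level field is copied onto each book record, which is where the index growth mentioned in the question comes from.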

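Apart from the obvious growth in index size, does querying such flattened records (with queries like "writer with scifi books from penguin") have a performance impact compared with nested records or parent/child records?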
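For concreteness, the "scifi books from penguin" query could be written roughly as follows for each layout. This is a sketch under assumptions that the post does not state: the genre and publisher fields are mapped as `keyword`, and in the nested layout `books` is mapped with type `nested`. The helper names `flat_query` and `nested_query` are hypothetical; the dictionaries are ordinary Elasticsearch query DSL bodies.

```python
def flat_query(genre, publisher):
    """Bool query for the flattened index: both terms are tested on the same book record."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"book_genre": genre}},
                    {"term": {"book_publisher": publisher}},
                ]
            }
        }
    }


def nested_query(genre, publisher):
    """Nested query: both terms must hold within a single element of `books`."""
    return {
        "query": {
            "nested": {
                "path": "books",  # assumes `books` has a nested mapping
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {"books.genre": genre}},
                            {"term": {"books.publisher": publisher}},
                        ]
                    }
                },
            }
        }
    }


flat_body = flat_query("scifi", "penguin")
nested_body = nested_query("scifi", "penguin")
```

Note that the flat query returns one hit per matching book, so the same writer can appear several times in the results, while the nested query returns one hit per writer document.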

Best answer

Querying the flat index will be much better! The whole idea behind NoSQL databases is to denormalize the data.

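In your first example, note that the record has to be updated every time a book is added. That is a big no-no in ES/NoSQL: ES records should be immutable. Behind the scenes, an update is really a delete plus an insert, which is very expensive.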
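The difference in write patterns can be sketched by building the bulk request lines for adding one book in each layout. This is an illustration only: the index names `books_flat` and `authors`, the helper names, and the author id are all hypothetical, and the scripted update is just one possible way to append to the `books` array.

```python
import json

INDEX_FLAT = "books_flat"  # hypothetical index holding one record per book
INDEX_NESTED = "authors"   # hypothetical index holding one document per author


def flat_add_book(author_name, book):
    """Bulk lines that index a new flat record; no existing document is touched."""
    action = {"index": {"_index": INDEX_FLAT}}
    source = {
        "name": author_name,
        "book_name": book["name"],
        "book_genre": book["genre"],
        "book_publisher": book["publisher"],
    }
    return [action, source]


def nested_add_book(author_id, book):
    """Bulk lines that append to `books` on an existing author document."""
    action = {"update": {"_index": INDEX_NESTED, "_id": author_id}}
    body = {
        "script": {
            "lang": "painless",
            "source": "ctx._source.books.add(params.book)",
            "params": {"book": book},
        }
    }
    return [action, body]


def to_ndjson(lines):
    """Serialize bulk lines as newline-delimited JSON, with the trailing newline."""
    return "".join(json.dumps(line) + "\n" for line in lines)


new_book = {"name": "Stars at dawn", "genre": "scifi", "publisher": "penguin"}
flat_payload = to_ndjson(flat_add_book("Multi G. Enre", new_book))
nested_payload = to_ndjson(nested_add_book("author-1", new_book))
```

In the flat layout the new book is a plain append, whereas in the nested layout the whole author document has to be rewritten, which is the cost the answer warns about.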

Source: a similar question on Stack Overflow, "elasticsearch - What are the performance drawbacks of flat vs. nested documents?": https://stackoverflow.com/questions/35483099/
