elasticsearch - Root cause of "Data too large" in Elasticsearch

I am struggling to find the exact cause of this error:

"type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [1704048152/1.5gb], which is larger than the limit of [1704040857/1.5gb]",
        "bytes_wanted": 1704048152,
        "bytes_limit": 1704040857,
        "durability": "PERMANENT"

It happens on my AWS Elasticsearch server. I suspected memory, so on my local laptop I set -Xms to 32 MB and -Xmx to 64 MB and tried inserting data into an index. After about 1,000,000 records I got this error:

circuit_breaking_exception
"reason": "[parent] Data too large

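I cannot get exactly the same error as on AWS Elasticsearch.
I tried to reproduce it by inserting more than 3,500,000 records, but I still do not get that exception locally.
I am new to Elasticsearch and would like to know what I need to change to avoid this problem on AWS Elasticsearch.
The AWS Elasticsearch configuration is:
Elasticsearch version: 7.4
Instance type (data): c5.xlarge.elasticsearch
EBS volume size: 60 GiB
Max clause count: 1024
Fielddata cache allocation: unbounded (default)
Let me know if more details are needed.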

Best answer

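The fielddata circuit breaker named in the error estimates how much memory a field would need in order to be loaded into the JVM heap. If loading it would push the heap past the limit, the breaker prevents the fielddata load by raising an exception. By default the limit is 40% of the maximum JVM heap. It is configurable (see https://www.elastic.co/guide/en/elasticsearch/reference/current/circuit-breaker.html#fielddata-circuit-breaker), and there are related settings you should be aware of here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-fielddata.html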
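To make the 40% rule concrete, here is a minimal sketch of the arithmetic using the two numbers from the error in the question. The function name fielddata_limit_bytes is made up for illustration, and the "implied heap" only holds if the domain really runs with the 40% default, which the question does not confirm.

```python
# Sketch of the fielddata breaker arithmetic; not Elasticsearch code.
# Assumption: the limit is the default 40% of the maximum JVM heap.

GIB = 1024 ** 3


def fielddata_limit_bytes(max_heap_bytes: int, limit_percent: int = 40) -> int:
    """Return the breaker limit for a given max heap (integer math, rounds down)."""
    return max_heap_bytes * limit_percent // 100


# Values copied from the circuit_breaking_exception in the question.
bytes_wanted = 1704048152
bytes_limit = 1704040857

# How far the estimated fielddata size exceeds the limit.
overage_bytes = bytes_wanted - bytes_limit

# Max heap implied by the limit, if the 40% default is in effect.
implied_heap_gib = bytes_limit / 0.40 / GIB

# For comparison: the limit a heap of exactly 4 GiB would produce.
limit_for_4gib_heap = fielddata_limit_bytes(4 * GIB)

print(f"over the limit by {overage_bytes} bytes")
print(f"implied max heap: {implied_heap_gib:.2f} GiB")
print(f"limit for a 4 GiB heap: {limit_for_4gib_heap} bytes")
```

Under that assumption the request only just tipped over the limit, so the fielddata for [_id] was already occupying nearly the whole 1.5 GB budget before it.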
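It looks like the node is overloaded. Increase the JVM heap. If that is not feasible, add another node so that the shards are spread across more instances.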
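Before resizing, it can help to see how close each breaker is to its limit. The sketch below summarizes a response from GET /_nodes/stats/breaker after it has been parsed into a dict. The helper name summarize_breakers and the sample payload are made up for illustration, and the sample numbers are not real measurements from the asker's cluster.

```python
# Sketch: summarize circuit breaker usage from a parsed
# GET /_nodes/stats/breaker response. The sample below is hypothetical.


def summarize_breakers(stats: dict) -> list:
    """Return (node, breaker, used_bytes, limit_bytes, percent_used, tripped) rows."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        for breaker_name, breaker in node.get("breakers", {}).items():
            limit = breaker["limit_size_in_bytes"]
            used = breaker["estimated_size_in_bytes"]
            percent = round(100.0 * used / limit, 1) if limit > 0 else 0.0
            rows.append((node.get("name", node_id), breaker_name,
                         used, limit, percent, breaker.get("tripped", 0)))
    return rows


# Hypothetical payload shaped like the real API response (values invented).
sample_stats = {
    "nodes": {
        "node-id-1": {
            "name": "data-node-1",
            "breakers": {
                "fielddata": {
                    "limit_size_in_bytes": 1704040857,
                    "estimated_size_in_bytes": 1500000000,
                    "tripped": 3,
                }
            },
        }
    }
}

rows = summarize_breakers(sample_stats)
for node, name, used, limit, percent, tripped in rows:
    print(f"{node} {name}: {used}/{limit} bytes ({percent}%), tripped {tripped}x")
```

A fielddata breaker that sits near 100% or keeps increasing its tripped count on one node would be consistent with the overloaded-node diagnosis above.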

Regarding elasticsearch - Root cause of "Data too large" in Elasticsearch, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63021295/
