elasticsearch - Impact of the Elasticsearch Scroll API on CPU

What impact does the Scroll API have on a node's CPU utilization? I am seeing high CPU usage with the Scroll API on ES version 6.2.

Even though the query runs only once to find all the data, and the data is then fetched using the scroll_id, we still see CPU spikes.

Also, where are the cached results stored: in memory or on disk?

Best answer

You should clear the scroll "pointer" after use.

Search context are automatically removed when the scroll timeout has been exceeded. However keeping scrolls open has a cost, as discussed in the previous section so scrolls should be explicitly cleared as soon as the scroll is not being used anymore using the clear-scroll API:

As described in the Elasticsearch documentation.
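To make the clear-scroll call concrete, here is a minimal sketch that builds the `DELETE /_search/scroll` request using only the Python standard library. The node address and the scroll id are placeholders chosen for illustration, and the helper name `build_clear_scroll_request` is hypothetical, not part of any Elasticsearch client.

```python
import json
import urllib.request


def build_clear_scroll_request(base_url, scroll_ids):
    """Build (but do not send) a clear-scroll request for the given scroll ids."""
    body = json.dumps({"scroll_id": list(scroll_ids)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_search/scroll",
        data=body,
        # Elasticsearch 6.x requires an explicit content type on requests with a body.
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )


# Placeholder address and scroll id, for illustration only.
request = build_clear_scroll_request("http://localhost:9200", ["example-scroll-id"])

# To send it against a running cluster, something like:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Passing several ids in one request releases all of those search contexts at once, which is convenient when a job has opened more than one scroll.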

Normally, the background merge process optimizes the index by merging together smaller segments to create new bigger segments, at which time the smaller segments are deleted. This process continues during scrolling, but an open search context prevents the old segments from being deleted while they are still in use. This is how Elasticsearch is able to return the results of the initial search request, regardless of subsequent changes to documents.

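So, if I understand correctly, there is no cache. The segments your query targets are simply frozen until your scroll expires. Because segments are immutable in Lucene, you are guaranteed consistent results and can scroll through all the data that existed when the scroll was created. The downside is that the targeted segments stay open and cannot be deleted for as long as your scroll "pointer" exists.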
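One way to keep that window as short as possible is to tie the lifetime of the search context to the loop that consumes it. The sketch below is only an illustration under stated assumptions: `send(method, path, body)` is a hypothetical transport callable that performs the HTTP request and returns the decoded JSON response, and the `keep_alive` and `page_size` defaults are arbitrary.

```python
def scroll_hits(send, index, query, keep_alive="1m", page_size=1000):
    """Yield every hit for `query`, clearing the scroll context on exit.

    `send(method, path, body)` is a hypothetical transport callable that
    performs the HTTP request and returns the decoded JSON response.
    """
    scroll_id = None
    try:
        # The initial search opens the search context and returns the first page.
        page = send("POST", f"/{index}/_search?scroll={keep_alive}",
                    {"size": page_size, "query": query})
        while True:
            scroll_id = page.get("_scroll_id", scroll_id)
            hits = page["hits"]["hits"]
            if not hits:
                break
            yield from hits
            # Each follow-up call renews the keep-alive and fetches the next page.
            page = send("POST", "/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": scroll_id})
    finally:
        # Runs on normal completion, on errors, and when the caller stops early,
        # so the old segments are released without waiting for the timeout.
        if scroll_id is not None:
            send("DELETE", "/_search/scroll", {"scroll_id": [scroll_id]})
```

Because the cleanup sits in a `finally` block inside a generator, closing the generator early (for example by breaking out of the consuming loop and calling `close()`) still issues the clear-scroll request.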

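As a result, the number of open segments keeps growing, and so does the number of file handles required. With broad queries, especially if you are indexing at the same time, this can cause performance problems.
Indexing creates many small segments that should later be merged, but while a scroll query holds on to them they cannot be fully merged away and deleted.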
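To check whether lingering scrolls are behind the growth in open segments and file handles, the node stats API exposes a few relevant counters. The following sketch assumes the JSON from `GET /_nodes/stats/indices,process` has already been fetched and decoded; the function name `summarize_scroll_pressure` is hypothetical, and the sample values are invented purely for illustration.

```python
def summarize_scroll_pressure(nodes_stats):
    """Return per-node scroll, segment, and file-descriptor counters.

    `nodes_stats` is the decoded JSON body of
    GET /_nodes/stats/indices,process (assumed to be fetched elsewhere).
    """
    summary = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        indices = node.get("indices", {})
        search = indices.get("search", {})
        summary[node.get("name", node_id)] = {
            # Search contexts currently held open on this node.
            "open_contexts": search.get("open_contexts", 0),
            # Scroll requests currently in progress.
            "scroll_current": search.get("scroll_current", 0),
            # Lucene segments currently held by the node's shards.
            "segment_count": indices.get("segments", {}).get("count", 0),
            # File descriptors held by the Elasticsearch process.
            "open_file_descriptors": node.get("process", {}).get("open_file_descriptors"),
        }
    return summary


# Invented sample response, trimmed to the fields used above.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "es-data-1",
            "indices": {
                "search": {"open_contexts": 12, "scroll_current": 12},
                "segments": {"count": 340},
            },
            "process": {"open_file_descriptors": 2100},
        }
    }
}
print(summarize_scroll_pressure(sample))
```

If `open_contexts` stays high after scroll jobs have finished, that suggests contexts are being left to expire on their own instead of being cleared.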

Are you indexing continuously, and how long do your scrolls last?

From the documentation

This question about the impact of the Elasticsearch Scroll API on CPU comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/53833289/
