elasticsearch - Elasticsearch and Logstash performance tuning

On a single-node Elasticsearch setup with Logstash, we tested parsing 20 MB and 200 MB log files into Elasticsearch on different types of AWS instances (Medium, Large, and Xlarge).

Environment details: Medium instance, 3.75 GB RAM, 1 core, storage: 4 GB SSD, 64-bit, network performance: moderate
Instance running with: Logstash, Elasticsearch

Scenario 1

**With default settings**
Result:
20 MB logfile: 23 mins, 175 events per second
200 MB logfile: 3 hrs 3 mins, 175 events per second


Added the following to settings:
Java heap size : 2GB
bootstrap.mlockall: true
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
index.translog.flush_threshold_ops: 50000
indices.memory.index_buffer_size: 50%

# Search thread pool
threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: 100

**With added settings**
Result:
20 MB logfile: 22 mins, 180 events per second
200 MB logfile: 3 hrs 7 mins, 180 events per second

Scenario 2

Environment details: R3 Large, 15.25 GB RAM, 2 cores, storage: 32 GB SSD, 64-bit, network performance: moderate
Instance running with: Logstash, Elasticsearch

**With default settings**
Result:
20 MB logfile: 7 mins, 750 events per second
200 MB logfile: 65 mins, 800 events per second

Added the following to settings:
Java heap size: 7gb
other parameters same as above

**With added settings**
Result:
20 MB logfile: 7 mins, 800 events per second
200 MB logfile: 55 mins, 800 events per second

Scenario 3

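Environment details:
R3 High-Memory Extra Large (r3.xlarge), 30.5 GB RAM, 4 cores, storage: 32 GB SSD, 64-bit, network performance: moderate
Instance running with: Logstash, Elasticsearch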

**With default settings**
Result:
20 MB logfile: 7 mins, 1200 events per second
200 MB logfile: 34 mins, 1200 events per second

 Added the following to settings:
    Java heap size: 15gb
    other parameters same as above

**With added settings**
Result:
20 MB logfile: 7 mins, 1200 events per second
200 MB logfile: 34 mins, 1200 events per second

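I would like to know:

What is the benchmark for performance?

Does this performance meet the benchmark, or is it below it?

Why can't I see any difference even after increasing the Elasticsearch JVM heap?

How can I monitor Logstash and improve its performance?

I appreciate any help on this, as I am new to Logstash and Elasticsearch.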

Best answer

I think this situation is related to the fact that Logstash uses fixed-size queues (The Logstash event processing pipeline):

Logstash sets the size of each queue to 20. This means a maximum of 20 events can be pending for the next stage. The small queue sizes mean that Logstash simply blocks and stalls safely when there’s a heavy load or temporary pipeline problems. The alternatives would be to either have an unlimited queue or drop messages when there’s a problem. An unlimited queue can grow unbounded and eventually exceed memory, causing a crash that loses all of the queued messages.
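
The blocking behavior described in that quote can be illustrated with a bounded queue. This is only a Python sketch of the general idea, not Logstash's actual implementation: `run_pipeline`, `num_workers`, and `work_seconds` are hypothetical names, and the `time.sleep` call stands in for filter work.

```python
import queue
import threading
import time

QUEUE_SIZE = 20  # the per-stage queue size mentioned in the quote


def run_pipeline(num_events, num_workers, work_seconds=0.001):
    """Push events through a bounded queue drained by worker threads.

    Returns (events processed, largest queue depth seen by the producer).
    """
    q = queue.Queue(maxsize=QUEUE_SIZE)
    processed = []
    lock = threading.Lock()
    max_depth = 0

    def worker():
        while True:
            event = q.get()
            if event is None:  # sentinel: no more events for this worker
                return
            time.sleep(work_seconds)  # stand-in for filter work
            with lock:
                processed.append(event)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for i in range(num_events):
        q.put(i)  # blocks while the queue already holds QUEUE_SIZE events
        max_depth = max(max_depth, q.qsize())

    for _ in threads:
        q.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return len(processed), max_depth


if __name__ == "__main__":
    start = time.perf_counter()
    count, depth = run_pipeline(num_events=500, num_workers=1)
    print(count, depth, f"{time.perf_counter() - start:.2f}s")
```

The producer never gets more than `QUEUE_SIZE` events ahead of the workers, so a slow stage throttles everything upstream instead of growing memory. Calling the function with a larger `num_workers` lets more events be in flight at once, which is the effect a higher worker count is meant to have.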

I think you should try increasing the number of workers with the '-w' flag.

On the other hand, many people say that Logstash should be scaled horizontally rather than by adding more cores and GBs of RAM (How to improve Logstash performance).

Source: Stack Overflow, https://stackoverflow.com/questions/28579481/
