I am trying to index a large database table with more than 10,000,000 rows, and Logstash runs out of memory.

The error:
logstash_1 | Error: Your application used more memory than the safety cap of 1G.
logstash_1 | Specify -J-Xmx####m to increase it (#### = cap size in MB).
logstash_1 | Specify -w for full OutOfMemoryError stack trace
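The error message points at the JVM heap cap. Raising the cap only postpones the problem (the real fix is paging, see the accepted answer below), but for reference, here is a minimal sketch of how it could be raised; it assumes Logstash runs from the official Docker image under docker-compose (the logstash_1 prefix suggests Compose), and the 2g value is an arbitrary assumption:

services:
  logstash:
    environment:
      # Assumption: the official Logstash 5.x+ Docker image appends
      # LS_JAVA_OPTS to the JVM options; older 2.x releases used the
      # LS_HEAP_SIZE variable (e.g. LS_HEAP_SIZE=2g) instead.
      LS_JAVA_OPTS: "-Xmx2g"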
My Logstash config:
input {
  jdbc {
    # Postgres jdbc connection string to our database
    jdbc_connection_string => "jdbc:postgresql://database:5432/predictiveparking"
    # The user we wish to execute our statement as
    jdbc_user => "predictiveparking"
    jdbc_password => "insecure"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/app/postgresql-9.4.1212.jar"
    # The name of the driver class for Postgresql
    jdbc_driver_class => "org.postgresql.Driver"
    # our query
    statement => "SELECT * from scans_scan limit 10"
  }
}

#output {
#  stdout { codec => json_lines }
#}

output {
  elasticsearch {
    index => "scans"
    sniffing => false
    document_type => "scan"
    # The sprintf reference "%{id}" pulls the id column from each row.
    # A bare "id" (as originally written) would give every document the
    # literal id "id" and keep overwriting a single document.
    document_id => "%{id}"
    hosts => ["elasticsearch"]
  }
}
Accepted answer
Just enable paging.

Add:

jdbc_paging_enabled => true

Now the data is fetched from the database in chunks, and we no longer run out of memory. Make sure the SQL query is ORDERED! (A full example follows the docs link below.)
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#plugins-inputs-jdbc-jdbc_paging_enabled
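For illustration, here is a paged version of the input block from the question; this is a sketch, not the asker's final config. The jdbc_page_size of 50000 is an arbitrary assumption (the plugin's default is 100000), and ORDER BY id assumes id is the table's primary key:

input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://database:5432/predictiveparking"
    jdbc_user => "predictiveparking"
    jdbc_password => "insecure"
    jdbc_driver_library => "/app/postgresql-9.4.1212.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    # With paging enabled, Logstash wraps the statement in a subquery
    # and fetches it with LIMIT/OFFSET, one page at a time, instead of
    # pulling the whole result set into memory.
    jdbc_paging_enabled => true
    jdbc_page_size => 50000
    statement => "SELECT * FROM scans_scan ORDER BY id"
  }
}

The ORDER BY matters because Postgres gives no guaranteed row order without it, so successive LIMIT/OFFSET pages over an unordered query can skip or duplicate rows.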
Source question: postgresql - Logstash running out of memory while reading a large Postgres table, on Stack Overflow: https://stackoverflow.com/questions/41610313/