postgresql - Logstash runs out of memory reading a large Postgres table

Tags: postgresql elasticsearch logstash

I am trying to index a large database table with more than 10,000,000 rows, and Logstash runs out of memory.

Error:

logstash_1       | Error: Your application used more memory than the safety cap of 1G.
logstash_1       | Specify -J-Xmx####m to increase it (#### = cap size in MB).
logstash_1       | Specify -w for full OutOfMemoryError stack trace

My Logstash configuration:
input {
    jdbc {
        # Postgres jdbc connection string to our database, mydb
        jdbc_connection_string => "jdbc:postgresql://database:5432/predictiveparking"
        # The user we wish to execute our statement as
        jdbc_user => "predictiveparking"
        jdbc_password => "insecure"
        # The path to our downloaded jdbc driver
        jdbc_driver_library => "/app/postgresql-9.4.1212.jar"
        # The name of the driver class for Postgresql
        jdbc_driver_class => "org.postgresql.Driver"
        # our query
        statement => "SELECT * from scans_scan limit 10"
    }
}


#output {
#    stdout { codec => json_lines }
#}

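output {
    elasticsearch {
        index => "scans"
        sniffing => false
        document_type => "scan"
        # Use each row's id column as the document ID; a bare "id" would
        # give every document the same literal ID
        document_id => "%{id}"
        hosts => ["elasticsearch"]
    }
}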

Best answer

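Just enable paging.

Add:

jdbc_paging_enabled => true

Now the data from the database table is fetched in chunks, and we no longer run out of memory. Make sure the SQL query is ORDERED! Paging splits the statement into several queries that use limits and offsets, and ordering is not guaranteed between those queries, so without an explicit ORDER BY rows can be skipped or returned twice.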

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#plugins-inputs-jdbc-jdbc_paging_enabled
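Put together, the input section might look like the sketch below. It reuses the connection settings from the question. The `ORDER BY id` clause assumes that `scans_scan` has a unique `id` column, and the `jdbc_page_size` value is only an illustrative choice, not something taken from the original post.

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:postgresql://database:5432/predictiveparking"
        jdbc_user => "predictiveparking"
        jdbc_password => "insecure"
        jdbc_driver_library => "/app/postgresql-9.4.1212.jar"
        jdbc_driver_class => "org.postgresql.Driver"
        # Fetch the result set page by page instead of in one query
        jdbc_paging_enabled => true
        # Rows per page; 50000 is an illustrative value (assumption)
        jdbc_page_size => 50000
        # Explicit ordering keeps the pages stable between queries;
        # assumes scans_scan has a unique id column
        statement => "SELECT * FROM scans_scan ORDER BY id"
    }
}
```

A smaller page size lowers the number of rows Logstash holds per query at the cost of more round trips to Postgres, so the value is worth tuning against the available heap.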

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/41610313/
