I am trying to ingest inventory data that is generated in JSON file format.
{
"_meta":{
"hostvars":{
"host1":{
"foreman":{
"architecture_id":1,
"architecture_name":"x86_64",
"capabilities":[
"build"
],
"certname":"host1",
"comment":"this is hostname1",
"created_at":"2017-03-08T15:27:11Z",
"disk":"10gb",
"domain_id":5,
},
"foreman_facts":{
"boardmanufacturer":"Intel Corporation",
"boardproductname":"440BX Desktop Reference Platform",
"ipaddress":"1.1.1.1",
"ipaddress_eth0":"1.1.1.2",
"ipaddress_lo":"127.0.0.1",
},
"foreman_params":{
}
},
"host2":{
"foreman":{
"architecture_id":1,
"architecture_name":"x86_64",
"capabilities":[
"build"
],
"certname":"host2",
"comment":"this hostname2",
"created_at":"2017-03-08T15:27:11Z",
"disk":"20gb",
"domain_id":5,
},
"foreman_facts":{
"boardmanufacturer":"Intel Corporation",
"boardproductname":"440BX Desktop Reference Platform",
"ipaddress":"2.1.1.1",
"ipaddress_eth0":"2.2.2.2",
"ipaddress_lo":"127.0.0.1",
},
"foreman_params":{
}
},
"foreman_all":[
"host3",
"host4",
],
"foreman_environment": [
"computer1",
"computer2"
],
I managed to get the data into Elasticsearch using the following configuration.
Filebeat configuration:
multiline.pattern: '^{'
multiline.negate: true
multiline.match: after
output.logstash:
# The Logstash hosts
hosts: ["localhost:5044"]
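A rough Python model of the three multiline settings above may help explain what they do to this file. With negate: true and match: after, a line that matches the pattern starts a new event and every other line is appended to the event before it. The function name group_multiline and the shortened sample lines are made up for illustration, and real Filebeat limits such as multiline.max_lines and multiline.timeout are ignored here.

```python
import re


def group_multiline(lines, pattern=r"^\{"):
    """Model of negate=true / match=after: a line matching `pattern`
    starts a new event, any other line joins the previous event.
    The default pattern means the same as '^{' in the config above."""
    start = re.compile(pattern)
    events = []
    for line in lines:
        if start.search(line) or not events:
            events.append([line])      # begin a new event
        else:
            events[-1].append(line)    # continuation of the previous event
    return ["\n".join(event) for event in events]


# Hypothetical, shortened version of the inventory file.
sample = [
    "{",
    '  "_meta": {',
    '    "hostvars": {',
    '      "host1": {"foreman": {"certname": "host1"}},',
    '      "host2": {"foreman": {"certname": "host2"}}',
    "    }",
    "  }",
    "}",
]

events = group_multiline(sample)
print(len(events))
```

Only the first line of a pretty-printed file starts with `{`, so under this model every later line is folded into the same event.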
Logstash:
input {
  beats {
    port => "5044"
  }
}
output {
  elasticsearch {
    hosts => [ "10.1.7.5:9200" ]
    index => "inventory-%{+YYYY-MM-dd}"
  }
  stdout {}
}
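However, I noticed that Filebeat treats the entire JSON file as a single message. Is there a way to split the message so that only the hostvars section is sent and one document is indexed per hostname, ignoring the foreman_all and foreman_environment fields in the JSON above? The data above is only a sample; I have to ingest about 100,000 records, so I want to send as little data over the network as possible.
I would like the data to be ingested into Elasticsearch in the following format. Can anyone suggest the best configuration to use?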
Elastic document ID 1
"computer name": "host1"
"architecture_id": 1,
"architecture_name": "x86_64",
"capabilities": ["build"],
"certname": "host1",
"comment": "this is hostname1",
"created_at": "2017-03-08T15:27:11Z",
"disk": "10gb",
"domain_id": 5,
"foreman_facts": {
"boardmanufacturer": "Intel Corporation",
"boardproductname": "440BX Desktop Reference Platform",
"ipaddress": "1.1.1.1",
"ipaddress_eth0": "1.1.1.2",
"ipaddress_lo": "127.0.0.1",
Elastic document ID 2
"computer name": "host2"
"architecture_id": 1,
"architecture_name": "x86_64",
"capabilities": ["build"],
"certname": "host2",
"comment": "this hostname2",
"created_at": "2017-03-08T15:27:11Z",
"disk": "20gb",
"domain_id": 5,
"boardmanufacturer": "Intel Corporation",
"boardproductname": "440BX Desktop Reference Platform",
"ipaddress": "2.1.1.1",
"ipaddress_eth0": "2.2.2.2",
"ipaddress_lo": "127.0.0.1",
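The reshaping described in the question can be sketched in Python as a pre-processing step. This is only an illustration of the target shape, not a Filebeat or Logstash feature. The function name host_documents is hypothetical, the sample is shortened, and it assumes that foreman_all and foreman_environment sit outside the per-host entries. It follows the first target document by keeping foreman_facts nested.

```python
import json


def host_documents(inventory):
    """Yield (doc_id, document) pairs, one per host under _meta.hostvars.
    Anything that is not a per-host object (e.g. group lists) is skipped."""
    hostvars = inventory.get("_meta", {}).get("hostvars", {})
    for hostname, data in hostvars.items():
        if not isinstance(data, dict):
            continue                                  # not a host entry
        doc = {"computer name": hostname}
        doc.update(data.get("foreman", {}))           # lift foreman keys to the top
        doc["foreman_facts"] = data.get("foreman_facts", {})
        yield hostname, doc


# Shortened, hypothetical version of the sample data from the question.
inventory = {
    "_meta": {
        "hostvars": {
            "host1": {
                "foreman": {"architecture_id": 1, "certname": "host1", "disk": "10gb"},
                "foreman_facts": {"ipaddress": "1.1.1.1"},
                "foreman_params": {},
            },
            "host2": {
                "foreman": {"architecture_id": 1, "certname": "host2", "disk": "20gb"},
                "foreman_facts": {"ipaddress": "2.1.1.1"},
                "foreman_params": {},
            },
        }
    },
    "foreman_all": ["host3", "host4"],
    "foreman_environment": ["computer1", "computer2"],
}

for doc_id, doc in host_documents(inventory):
    # One compact JSON object per line, keyed by hostname.
    print(doc_id, json.dumps(doc))
```

Writing one compact object per host keeps the group lists out of the output entirely, which matches the goal of sending as little data as possible.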
Best answer
Set document_type in the Filebeat configuration:
filebeat:
  prospectors:
    - input_type: log
      paths:
        - "/home/ubuntu/data/test.json"
      document_type: json
      json.message_key: log
      json.keys_under_root: true
      json.overwrite_keys: true
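As a rough mental model of json.keys_under_root and json.overwrite_keys, the Python sketch below mimics how a decoded line is merged into an event. It is not Filebeat's implementation; decode_event, the sample line, and the sample field values are assumptions made for illustration. Note that Filebeat decodes JSON line by line, so these options expect one JSON object per line.

```python
import json


def decode_event(line, filebeat_fields, keys_under_root=True, overwrite_keys=True):
    """Merge one decoded JSON line into an event.
    filebeat_fields stands in for fields Filebeat adds itself (type, source, ...)."""
    decoded = json.loads(line)
    event = dict(filebeat_fields)
    if not keys_under_root:
        event["json"] = decoded        # decoded object stays under a "json" key
        return event
    for key, value in decoded.items():
        if key in event and not overwrite_keys:
            continue                   # on conflict, the existing field wins
        event[key] = value
    return event


# Hypothetical single-line record and Filebeat-added fields.
line = '{"certname": "host1", "disk": "10gb", "type": "inventory"}'
fields = {"type": "json", "source": "/home/ubuntu/data/test.json"}

event = decode_event(line, fields)
kept = decode_event(line, fields, overwrite_keys=False)
nested = decode_event(line, fields, keys_under_root=False)
print(event["type"], kept["type"], sorted(nested))
```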
This may help: https://www.elastic.co/blog/structured-logging-filebeat
Then parse the field in Logstash with the json filter:
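filter {
  # Parse the JSON string held in "parameter" into "parameterData",
  # then drop the original string field.
  json {
    source => "parameter"
    target => "parameterData"
    remove_field => "parameter"
  }
}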
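To make the effect of the json filter settings above concrete, here is a small Python model of them: parse the string in the source field, store the result under the target field, and remove the listed field only when parsing succeeds. The function json_filter and the sample events are illustrative assumptions, not Logstash code. The _jsonparsefailure tag mirrors the plugin's default failure tag.

```python
import json


def json_filter(event, source, target=None, remove_field=None):
    """Model of a json filter with source, target and remove_field."""
    out = dict(event)
    if source not in out:
        return out                                   # nothing to parse
    try:
        parsed = json.loads(out[source])
    except (TypeError, ValueError):
        # Parsing failed: tag the event and leave every field in place.
        out["tags"] = list(out.get("tags", [])) + ["_jsonparsefailure"]
        return out
    if target:
        out[target] = parsed                         # nest under the target field
    else:
        out.update(parsed)                           # no target: merge at the root
    if remove_field:
        out.pop(remove_field, None)                  # applied only on success
    return out


# Hypothetical events: one valid payload, one broken payload.
good = json_filter({"parameter": '{"certname": "host1"}'},
                   source="parameter", target="parameterData",
                   remove_field="parameter")
bad = json_filter({"parameter": "{oops"},
                  source="parameter", target="parameterData",
                  remove_field="parameter")
print(good)
print(bad)
```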
Docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html
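Use std_in and std_out (Logstash's stdin input and stdout output plugins) for testing.
Regarding "json - Json file from Filebeat to Logstash and then to elasticsearch", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47412271/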