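I'm currently trying to convert some YAML to JSON with Python, but I'm struggling to get the JSON structured correctly. My YAML file contains multiple documents that look like this: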
title: Windows Shell Spawning Suspicious Program
status: experimental
description: Detects a suspicious child process of a Windows shell
references:
    - https://mgreen27.github.io/posts/2018/04/02/DownloadCradle.html
author: Florian Roth
date: 20018/04/06
logsource:
    product: windows
    service: sysmon
detection:
    selection:
        EventID: 1
        ParentImage:
            - '*\mshta.exe'
            - '*\powershell.exe'
            - '*\cmd.exe'
            - '*\rundll32.exe'
            - '*\cscript.exe'
            - '*\wscript.exe'
            - '*\wmiprvse.exe'
        Image:
            - '*\schtasks.exe'
            - '*\nslookup.exe'
            - '*\certutil.exe'
            - '*\bitsadmin.exe'
            - '*\mshta.exe'
    condition: selection
fields:
    - CommandLine
    - ParentCommandLine
falsepositives:
    - Administrative scripts
level: medium
...
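What I want to do is, for each document, extract detection, fields, falsepositives and level, and put them into the JSON document as separate arrays. My first attempt was pretty bad and just lumps the groups from every document together into lists: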
import json
import yaml  # PyYAML

data = {}
data['indicator'] = {}
data['indicator']['detection'] = []
data['indicator']['fields'] = []
data['indicator']['false positives'] = []
data['indicator']['level'] = []
with open(yaml_file, 'r') as yaml_in, open(json_file, 'a') as definition:
    loadyaml = yaml.safe_load_all(yaml_in)
    for item in loadyaml:
        # .items() works on Python 2 and 3 (the original used .iteritems())
        for header, subsections in item.items():
            if header == 'detection':
                data['indicator']['detection'].append(subsections)
            elif header == 'fields':
                data['indicator']['fields'].append(subsections)
            # the YAML key is 'falsepositives', so this branch never matches
            elif header == 'false positives':
                data['indicator']['false positives'].append(subsections)
            elif header == 'level':
                data['indicator']['level'].append(subsections)
    json.dump(data, definition, indent=4)
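I would like each of my documents to go into the JSON document as a separate indicator, with its detection, fields, falsepositives and level all grouped together, but my Python ability is letting me down.
Any insight on this would be greatly appreciated!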
Best answer
You can get the desired output by iterating over the result of .load_all(), with a much smaller program:
import sys
import ruamel.yaml
import json

yaml = ruamel.yaml.YAML(typ='safe')
ind = dict()
data = dict(indicator=ind)
for d in yaml.load_all(open('input.yaml')):
    for k in ('detection', 'fields', 'falsepositives', 'level'):
        ind.setdefault(k, []).append(d[k])
json.dump(data, sys.stdout, indent=2)
If you have the file input.yaml:
---
title: Windows Shell Spawning Suspicious Program
status: experimental
description: Detects a suspicious child process of a Windows shell
references:
    - https://mgreen27.github.io/posts/2018/04/02/DownloadCradle.html
author: Florian Roth
date: 20018/04/06
logsource:
    product: windows
    service: sysmon
detection:
    selection:
        EventID: 1
        ParentImage:
            - '*\mshta.exe'
            - '*\powershell.exe'
            - '*\cmd.exe'
            - '*\rundll32.exe'
            - '*\cscript.exe'
            - '*\wscript.exe'
            - '*\wmiprvse.exe'
        Image:
            - '*\schtasks.exe'
            - '*\nslookup.exe'
            - '*\certutil.exe'
            - '*\bitsadmin.exe'
            - '*\mshta.exe'
    condition: selection
fields:
    - CommandLine
    - ParentCommandLine
falsepositives:
    - Administrative scripts
level: medium
...
---
title: Bash starting just what is asked
status: stabel
description: No negative side effects
references:
    - https://nblue24.github.io/posts/2019/04/01/DownloadBed.html
author: Axel Roth
date: 2019/04/01
logsource:
    product: linux
    service: good
detection:
    selection:
        EventID: 42
        ParentImage:
            - '*/bash'
            - '*/ash'
        Image:
            - systemctl
            - init
    condition: selection
fields:
    - Shell
    - ParentShell
falsepositives:
    - root programs
level: high
...
your output will be:
{
  "indicator": {
    "detection": [
      {
        "selection": {
          "EventID": 1,
          "ParentImage": [
            "*\\mshta.exe",
            "*\\powershell.exe",
            "*\\cmd.exe",
            "*\\rundll32.exe",
            "*\\cscript.exe",
            "*\\wscript.exe",
            "*\\wmiprvse.exe"
          ],
          "Image": [
            "*\\schtasks.exe",
            "*\\nslookup.exe",
            "*\\certutil.exe",
            "*\\bitsadmin.exe",
            "*\\mshta.exe"
          ]
        },
        "condition": "selection"
      },
      {
        "selection": {
          "EventID": 42,
          "ParentImage": [
            "*/bash",
            "*/ash"
          ],
          "Image": [
            "systemctl",
            "init"
          ]
        },
        "condition": "selection"
      }
    ],
    "fields": [
      [
        "CommandLine",
        "ParentCommandLine"
      ],
      [
        "Shell",
        "ParentShell"
      ]
    ],
    "falsepositives": [
      [
        "Administrative scripts"
      ],
      [
        "root programs"
      ]
    ],
    "level": [
      "medium",
      "high"
    ]
  }
}
This works on both Python 2 and 3.
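The program above groups values by key across all documents. If the goal is instead one entry per document, with that document's detection, fields, falsepositives and level kept together, a minimal sketch could look like the following. The function name to_indicators, the top-level key "indicators", and the choice to store a missing key as null are all assumptions made for illustration, not something from the question or the answer. The sample list stands in for the documents that yaml.load_all() would yield, so the sketch runs with only the standard library.

```python
import json

# Keys to keep from each document (the same four the question asks for)
KEYS = ('detection', 'fields', 'falsepositives', 'level')


def to_indicators(documents, keys=KEYS):
    """Build one indicator dict per document, keeping only `keys`.

    `documents` is any iterable of already-loaded YAML documents (dicts).
    The name and output layout are hypothetical, chosen for this sketch.
    """
    indicators = []
    for doc in documents:
        if doc is None:  # an empty document in a YAML stream loads as None
            continue
        # doc.get() stores None for a missing key instead of raising KeyError
        indicators.append({k: doc.get(k) for k in keys})
    return {'indicators': indicators}


# Stand-in for what yaml.load_all(...) would yield for a two-document file
docs = [
    {'title': 'first',
     'detection': {'selection': {'EventID': 1}, 'condition': 'selection'},
     'fields': ['CommandLine', 'ParentCommandLine'],
     'falsepositives': ['Administrative scripts'],
     'level': 'medium'},
    {'title': 'second',
     'detection': {'selection': {'EventID': 42}, 'condition': 'selection'},
     'fields': ['Shell', 'ParentShell'],
     'level': 'high'},  # no 'falsepositives' key in this one
]

result = to_indicators(docs)
print(json.dumps(result, indent=2))
```

With the real file, the call would presumably be to_indicators(yaml.load_all(stream)) using the yaml object from the answer. Unlike d[k] in the program above, doc.get(k) does not fail when a document lacks one of the keys.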
Regarding "Python: converting multiple YAML documents to JSON", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51291788/