python - Storing a list of strings with sublists in a JSON file in Python

Tags: python json python-2.7

I am using Python and have data like this:

RedHat Enterprise Linux ES 2.1 IA64
RedHat Enterprise Linux ES 2.1
Red Hat Enterprise Linux AS 2.1
Linux kernel 2.6.9 
Linux kernel 2.6.8 rc3
Linux kernel 2.6.8 rc1
    + Ubuntu Ubuntu Linux 4.1 ppc
    + Ubuntu Ubuntu Linux 4.1 ia64
Linux kernel 2.6.8 

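I want to store this information in a JSON file, but I don't know how. For example, I have a list of RedHat and Linux kernel entries, and the Ubuntu entries are a sublist of Linux kernel 2.6.8 rc1, something like the following: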

{"RedHat Enterprise Linux ES 2.1 IA64":{} ,"RedHat Enterprise Linux ES 2.1":{} ,"Red Hat Enterprise":{"Linux AS 2.1","Linux kernel 2.6.9","Linux kernel 2.6.8 rc3","Linux kernel 2.6.8 rc1"},"Linux kernel 2.6.8":{}}

Here is my entire string:

'RedHat Enterprise Linux WS  2.1 IA64RedHat Enterprise Linux WS  2.1RedHat Enterprise Linux ES  2.1 IA64RedHat Enterprise Linux ES  2.1Red Hat Enterprise Linux AS  2.1 IA64Red Hat Enterprise Linux AS  2.1Linux kernel 2.6.9 Linux kernel 2.6.8 rc3Linux kernel 2.6.8 rc2Linux kernel 2.6.8 rc1+ Ubuntu Ubuntu Linux 4.1 ppc+ Ubuntu Ubuntu Linux 4.1 ia64+ Ubuntu Ubuntu Linux 4.1 ia32Linux kernel 2.6.8 Linux kernel 2.6.7 rc1Linux kernel 2.6.7 Linux kernel 2.6.6 rc1Linux kernel 2.6.6 Linux kernel 2.6.5 Linux kernel 2.6.4 Linux kernel 2.6.3 Linux kernel 2.6.2 Linux kernel 2.6.1 -rc2Linux kernel 2.6.1 -rc1Linux kernel 2.6.1 Linux kernel 2.6 .10Linux kernel 2.6 -test9-CVSLinux kernel 2.6 -test9Linux kernel 2.6 -test8Linux kernel 2.6 -test7Linux kernel 2.6 -test6Linux kernel 2.6 -test5Linux kernel 2.6 -test4Linux kernel 2.6 -test3Linux kernel 2.6 -test2Linux kernel 2.6 -test11Linux kernel 2.6 -test10Linux kernel 2.6 -test1Linux kernel 2.6 Linux kernel 2.4.28 + Trustix Secure Enterprise Linux 2.0 + Trustix Secure Linux 2.2 + Trustix Secure Linux 2.1 + Trustix Secure Linux 2.0 Linux kernel 2.4.27 -pre5Linux kernel 2.4.27 -pre4Linux kernel 2.4.27 -pre3Linux kernel 2.4.27 -pre2Linux kernel 2.4.27 -pre1Linux kernel 2.4.27 Linux kernel 2.4.26 Linux kernel 2.4.25 Linux kernel 2.4.24 -ow1Linux kernel 2.4.24 Linux kernel 2.4.23 -pre9Linux kernel 2.4.23 -ow2Linux kernel 2.4.23 + Trustix Secure Linux 2.0 Linux kernel 2.4.22 + Devil-Linux Devil-Linux 1.0.5 + Devil-Linux Devil-Linux 1.0.4 + Mandriva Linux Mandrake 9.2  amd64+ Mandriva Linux Mandrake 9.2 + Red Hat Fedora  Core1+ Slackware Linux 9.1 Linux kernel 2.4.21 pre7Linux kernel 2.4.21 pre4Linux kernel 2.4.21 pre1Linux kernel 2.4.21 + Conectiva Linux 9.0 + Mandriva Linux Mandrake 9.1 ppc+ Mandriva Linux Mandrake 9.1 + Red Hat Enterprise Linux AS  3+ RedHat Desktop 3.0 + RedHat Enterprise Linux ES  3+ RedHat Enterprise Linux WS  3+ S.u.S.E. Linux Personal 9.0 x86_64+ S.u.S.E. 
Linux Personal 9.0 + SuSE SUSE Linux Enterprise Server  8Linux kernel 2.4.20 Linux kernel 2.4.19 -pre6Linux kernel 2.4.19 -pre5Linux kernel 2.4.19 -pre4Linux kernel 2.4.19 -pre3Linux kernel 2.4.19 -pre2Linux kernel 2.4.19 -pre1Linux kernel 2.4.19 + Conectiva Linux 8.0 + Conectiva Linux Enterprise Edition 1.0 + MandrakeSoft Corporate Server 2.1  x86_64+ MandrakeSoft Corporate Server 2.1 + MandrakeSoft Multi Network Firewall 2.0 + Mandriva Linux Mandrake 9.0 + S.u.S.E. Linux 8.1 + Slackware Linux  -current+ SuSE SUSE Linux Enterprise Server  8+ SuSE SUSE Linux Enterprise Server  7Linux kernel 2.4.18 pre-8Linux kernel 2.4.18 pre-7Linux kernel 2.4.18 pre-6Linux kernel 2.4.18 pre-5Linux kernel 2.4.18 pre-4Linux kernel 2.4.18 pre-3Linux kernel 2.4.18 pre-2Linux kernel 2.4.18 pre-1Linux kernel 2.4.18  x86Linux kernel 2.4.18 + Astaro Security Linux 2.0 23+ Astaro Security Linux 2.0 16+ Debian Linux 3.0  sparc+ Debian Linux 3.0  s/390+ Debian Linux 3.0  ppc+ Debian Linux 3.0  mipsel+ Debian Linux 3.0  mips+ Debian Linux 3.0  m68k+ Debian Linux 3.0  ia-64+ Debian Linux 3.0  ia-32+ Debian Linux 3.0  hppa+ Debian Linux 3.0  arm+ Debian Linux 3.0  alpha+ Mandriva Linux Mandrake 8.2 + Mandriva Linux Mandrake 8.1 + Mandriva Linux Mandrake 8.0 + Red Hat Enterprise Linux AS  2.1 IA64+ RedHat Advanced Workstation for the Itanium Processor 2.1 IA64+ RedHat Advanced Workstation for the Itanium Processor 2.1 + RedHat Linux 8.0 + RedHat Linux 7.3 + S.u.S.E. Linux 8.1 + S.u.S.E. Linux 8.0 + S.u.S.E. Linux 7.3 + S.u.S.E. Linux 7.2 + S.u.S.E. Linux 7.1 + S.u.S.E. Linux Connectivity Server  + S.u.S.E. Linux Database Server  0+ S.u.S.E. Linux Firewall on CD  + S.u.S.E. Linux Office Server  + S.u.S.E. Linux Openexchange Server  + S.u.S.E. Linux Personal 8.2 + S.u.S.E. SuSE eMail Server 3.1 + S.u.S.E. 
SuSE eMail Server III  + SuSE SUSE Linux Enterprise Server  8+ SuSE SUSE Linux Enterprise Server  7+ Turbolinux Turbolinux Server 8.0 + Turbolinux Turbolinux Server 7.0 + Turbolinux Turbolinux Workstation 8.0 + Turbolinux Turbolinux Workstation 7.0 Linux kernel 2.4.17 Linux kernel 2.4.16 Linux kernel 2.4.15 Linux kernel 2.4.14 Linux kernel 2.4.13 + Caldera OpenLinux Server 3.1.1 + Caldera OpenLinux Workstation 3.1.1 Linux kernel 2.4.12 + Conectiva Linux 7.0 Linux kernel 2.4.11 Linux kernel 2.4.10 Linux kernel 2.4.9 + Red Hat Enterprise Linux AS  2.1 IA64+ Red Hat Enterprise Linux AS  2.1+ RedHat Enterprise Linux ES  2.1 IA64+ RedHat Enterprise Linux ES  2.1+ RedHat Enterprise Linux WS  2.1 IA64+ RedHat Enterprise Linux WS  2.1+ RedHat Linux 7.2  ia64+ RedHat Linux 7.2  i386+ RedHat Linux 7.2  alpha+ RedHat Linux 7.1  ia64+ RedHat Linux 7.1  i386+ RedHat Linux 7.1  alpha+ Sun Linux 5.0.5 + Sun Linux 5.0.3 + Sun Linux 5.0 Linux kernel 2.4.8 + Mandriva Linux Mandrake 8.2 + Mandriva Linux Mandrake 8.1 + Mandriva Linux Mandrake 8.0 Linux kernel 2.4.7 + RedHat Linux 7.2 + S.u.S.E. Linux 7.2 + S.u.S.E. Linux 7.1 Linux kernel 2.4.6 Linux kernel 2.4.5 + Slackware Linux 8.0 Linux kernel 2.4.4 + S.u.S.E. 
Linux 7.2 Linux kernel 2.4.3 + Mandriva Linux Mandrake 8.0  ppc+ Mandriva Linux Mandrake 8.0 Linux kernel 2.4.2 Linux kernel 2.4.1 Linux kernel 2.4 .0-test9Linux kernel 2.4 .0-test8Linux kernel 2.4 .0-test7Linux kernel 2.4 .0-test6Linux kernel 2.4 .0-test5Linux kernel 2.4 .0-test4Linux kernel 2.4 .0-test3Linux kernel 2.4 .0-test2Linux kernel 2.4 .0-test12Linux kernel 2.4 .0-test11Linux kernel 2.4 .0-test10Linux kernel 2.4 .0-test1Linux kernel 2.4 Debian Linux 3.1  sparcDebian Linux 3.1  s/390Debian Linux 3.1  ppcDebian Linux 3.1  mipselDebian Linux 3.1  mipsDebian Linux 3.1  m68kDebian Linux 3.1  ia-64Debian Linux 3.1  ia-32Debian Linux 3.1  hppaDebian Linux 3.1  armDebian Linux 3.1  amd64Debian Linux 3.1  alphaDebian Linux 3.1 Debian Linux 3.0  sparcDebian Linux 3.0  s/390Debian Linux 3.0  ppcDebian Linux 3.0  mipselDebian Linux 3.0  mipsDebian Linux 3.0  m68kDebian Linux 3.0  ia-64Debian Linux 3.0  ia-32Debian Linux 3.0  hppaDebian Linux 3.0  armDebian Linux 3.0  alphaDebian Linux 3.0'

I need to parse this, where a + marks a sub-entry.

Best answer

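I took a stab at the problem you are trying to solve, working through a sample SecurityFocus BID article (in this case securityfocus.com/bid/20959). The idea is to use a scraping tool such as BeautifulSoup to extract the text from the web page. That text can then be parsed into a JSON-serializable object and dumped to a file. On the SecurityFocus page, the full list of vulnerable operating systems sits inside a single tag. Related entries appear below the entry they belong to and are preceded by a + symbol (for example, + Ubuntu Ubuntu Linux 4.1 ppc under Linux kernel 2.6.8 rc1). The + symbol is not actually a bare +; it arrives as something like \n\t\t\t\t\t\t\t\t+, so the string has to be processed before it is converted to JSON. The code below performs this task for the URL securityfocus.com/bid/20959.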

from bs4 import BeautifulSoup
import json

try:
    from urllib2 import urlopen          # Python 2.7
except ImportError:
    from urllib.request import urlopen   # Python 3

response = urlopen('http://www.securityfocus.com/bid/20959')
html = response.read()
soup = BeautifulSoup(html, 'html.parser')
div_element = soup.find(id="vulnerability")
tr_elements = div_element.find_all(valign="top")

td_elements = tr_elements[1].find_all("td")

# Each stripped string is one entry; sub-entries start with '+'.
os_names_list = list(td_elements[1].stripped_strings)

# Map every top-level entry to the list of '+' entries that follow it.
vulnerability_os_mapping = {}
current_parent = None
for os_name in os_names_list:
    if os_name.startswith('+'):
        # Drop the leading '+' and attach the entry to the most recent parent.
        if current_parent is not None:
            related_name = " ".join(os_name[1:].split())
            vulnerability_os_mapping[current_parent].append(related_name)
    else:
        current_parent = os_name
        vulnerability_os_mapping.setdefault(current_parent, [])

# Write the mapping to a file named vulnerability_list_<bid_id>.json
with open('vulnerability_list_20959.json', 'w') as vulnerability_list_file:
    json.dump(vulnerability_os_mapping, vulnerability_list_file)
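
The whitespace around the + marker mentioned above can be handled without any scraping library. The following is a minimal sketch, assuming the raw text looks roughly like the hypothetical `raw_text` string below (the real page content may differ); `clean_entries` is an illustrative helper name, not part of the code above.

```python
# Hypothetical raw text: sub-entries are preceded by a newline and tabs.
raw_text = (
    "Linux kernel 2.6.8 rc1"
    "\n\t\t\t\t\t\t\t\t+ Ubuntu Ubuntu Linux 4.1 ppc"
    "\n\t\t\t\t\t\t\t\t+ Ubuntu Ubuntu Linux 4.1 ia64"
    "\nLinux kernel 2.6.8 "
)


def clean_entries(text):
    """Split text into lines, strip surrounding whitespace, drop empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


entries = clean_entries(raw_text)
for entry in entries:
    print(entry)
```

After this step every sub-entry starts with a plain +, so a simple `startswith('+')` check is enough to tell parents and sub-entries apart.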

I hope this gives you an idea of how to approach this kind of task.
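
The grouping step can also be tried offline on the sample data from the question. This is a minimal sketch, assuming the entries are already available one per line; `group_entries` is a hypothetical helper name introduced here for illustration.

```python
import json


def group_entries(lines):
    """Map each top-level line to the list of '+' lines that follow it."""
    mapping = {}
    parent = None
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        if entry.startswith('+'):
            if parent is None:
                raise ValueError("sub-entry without a parent: %r" % entry)
            # Remove the '+' marker and any whitespace after it.
            mapping[parent].append(entry[1:].strip())
        else:
            parent = entry
            mapping.setdefault(parent, [])
    return mapping


# Sample lines copied from the question.
sample = """\
RedHat Enterprise Linux ES 2.1 IA64
RedHat Enterprise Linux ES 2.1
Red Hat Enterprise Linux AS 2.1
Linux kernel 2.6.9
Linux kernel 2.6.8 rc3
Linux kernel 2.6.8 rc1
    + Ubuntu Ubuntu Linux 4.1 ppc
    + Ubuntu Ubuntu Linux 4.1 ia64
Linux kernel 2.6.8
"""

grouped = group_entries(sample.splitlines())
print(json.dumps(grouped, indent=2))
```

Each value is a list rather than a brace-delimited group of bare strings, because JSON has no set type: an object needs key/value pairs, so a sublist of names has to be an array.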

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/18459117/
