python - Using a list to specify which sub-dictionary to use

I'm not sure how to go about this, but basically I have a list of items:

section = ['messages','ProcQueueLen']

or

section = ['messages','CpuError']

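...depending on which section we are in...

For example, some data points belong to the ProcQueueLen section.

I want to build a dictionary dynamically so that I can add each data point (as a dictionary) to the correct nested entry. For example: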

<setup>
   logfile = cdm.log
   loglevel = 0
   cpu_usage_includes_wait = yes
   internal_alarm_message = InternalAlarm
   mem_buffer_used = no
   alarm_on_each_sample = no
   qos_source_short = yes
   trendsubject = cdm
   trendpriority = information  
   paging_in_kilobytes = yes
   post_install = 1382462705
   allow_qos_source_as_target = no
   monitor_iostat = yes
   allow_remote_disk_info = yes
</setup>
<messages>
   <ProcQueueLen>
      text = Average ($value_number samples) processor queue length is $value$unit, which is >= $value_limit$unit. Last value is $value_last$unit.
      level = minor
      token = proc_q_len
   </ProcQueueLen>
   <CpuError>
      text = Average ($value_number samples) total cpu is now $value$unit, which is above the error threshold ($value_limit$unit)
      level = major
      token = cpu_error
      i18n_token = as#system.cdm.avrg_total_cpu_above_err_threshold
   </CpuError>
</messages>

would produce a nested dictionary like this:

conf = {'messages':{'ProcQueueLen':{'text':'Average ($value_number samples) processor queue length is $value$unit, which is >= $value_limit$unit. Last value is $value_last$unit.','level':'minor','token':'proc_q_len'},'CpuError':{'text':'Average ($value_number samples) total cpu is now $value$unit, which is above the error threshold ($value_limit$unit)','level':'major','token':'cpu_error','i18n_token':'as#system.cdm.avrg_total_cpu_above_err_threshold'}}}

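I'm reading the file that contains these sections line by line, and I track which section an entry belongs to by appending to and popping from the section list as needed. But I'm not sure how to address the nested dictionary based on that list of sections.

This is not valid XML: it has no proper sections and contains invalid characters. I tried BeautifulSoup, but it was slow. Putting the data into a nested dictionary lets me navigate it faster and more easily.

The only code I have so far is the following: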

conf = {}
section = []
for i, line in enumerate(out.split('\\n')):
    l = line.strip()
    if i < 20:
        print(l)
        if l.startswith('</'):
            print('skipping')
        elif l.startswith('<'):
            conf[l] = {}
            section.append(l)
            print('create dbentry')
        else:
            conf[section][l.split('=')[0].strip()] = l.split('=')[1].strip()
            print('add to dbentry')

This doesn't work because in this case [section] would need to be a list of sections, and I'm not sure how to do that.

@Ajax1234 This is what I get from your solution.

print([c for c in _r if c[0]])
[['\\n   logfile', 'cdm.log\\n   loglevel', '0\\n   cpu_usage_includes_wait', 'yes\\n   internal_alarm_message', 'InternalAlarm\\n   mem_buffer_used', 'no\\n   alarm_on_each_sample', 'no\\n   qos_source_short', 'yes\\n   trendsubject', 'cdm\\n   trendpriority', 'information\\n   paging_in_kilobytes', 'yes\\n   post_install', '1382462705\\n   allow_qos_source_as_target', 'no\\n   monitor_iostat', 'yes\\n   allow_remote_disk_info', 'yes\\n']]
print(dict([c for c in _r if c[0]]))
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
ValueError: dictionary update sequence element #0 has length 15; 2 is required
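
For context on the traceback: `dict()` accepts a sequence only when every element is a key/value pair of length 2, while the single inner list printed above has 15 items. A small illustration with made-up values follows (the name `pairs` is hypothetical; this only shows why the error occurs, not how to adapt the solution that produced the list):

```python
# Each element has exactly two items, so dict() can build key/value pairs.
pairs = [["logfile", "cdm.log"], ["loglevel", "0"]]
print(dict(pairs))

# A single element with three items triggers the same kind of ValueError.
try:
    dict([["logfile", "cdm.log", "loglevel"]])
except ValueError as exc:
    print(exc)
```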

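Best answer

If you can redefine the input syntax, I'd suggest using a plain .ini file and Python's configparser.

I like Ajax's and Serge Ballista's answers, but if you want to modify your existing code so that it works, try the following: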
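
As a rough sketch of what that could look like: configparser has no nested sections, so the example below assumes the nesting is flattened into dotted section names such as `messages.CpuError`. That naming convention, the `INI_TEXT` sample, and the rebuilding loop are illustrative assumptions, not part of the original file format. Interpolation is switched off so the message placeholders would be read as plain text.

```python
import configparser

# Hypothetical .ini rewrite of part of the <messages> block. The dotted
# section names are an assumed convention, since configparser does not nest.
INI_TEXT = """
[messages.ProcQueueLen]
level = minor
token = proc_q_len

[messages.CpuError]
level = major
token = cpu_error
i18n_token = as#system.cdm.avrg_total_cpu_above_err_threshold
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INI_TEXT)

# Rebuild the nested dictionary by splitting each section name on the dots.
conf = {}
for name in parser.sections():
    node = conf
    for key in name.split("."):
        node = node.setdefault(key, {})
    node.update(parser[name])

print(conf["messages"]["CpuError"]["level"])
```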

import pprint
conf = {}
section = []
for i, line in enumerate(out.split('\n')):
    l = line.strip()
    if i < 20:
        l = l.strip("\n")
        if not l:
            # skip blank lines (such as the empty string left after a trailing newline)
            continue 
        if l.startswith('</'):
            # we need to remove this from the list of current sections
            section.pop()
            print('skipping')
        elif l.startswith('<'):
            sec_name = l.strip("<>")  # what you wanted was conf["messages"], not conf["<messages>"]
            secstr = "".join(f"['{x}']" for x in section)  # create a string that looks something like ['messages']['ProcQueueLen']
            correct = eval(f"conf{secstr}")  # use the string to evaluate to an actual section in your conf dict          
            correct[sec_name] = {}  # set the new section to an empty dictionary
            section.append(sec_name)  # add the new section to the dictionary route
            print(f"create dbentry: {secstr}['{sec_name}']")
        else:
            secstr = "".join(f"['{x}']" for x in section)
            correct = eval(f"conf{secstr}")
            # Values can contain '=' (for example '>='), so split on ' = ' rather than '=';
            # maxsplit=1 keeps any later ' = ' inside the value intact.
            key, value = l.split(' = ', 1)
            correct[key.strip()] = value.strip()
            print(f"add to dbentry: {correct[key.strip()]}")
pprint.pprint(conf)

With this, plus your input, I get the following output:

{'messages': {'CpuError': {'i18n_token': 'as#system.cdm.avrg_total_cpu_above_err_threshold',
                           'level': 'major',
                           'text': 'Average ($value_number samples) total cpu '
                                   'is now $value$unit, which is above the '
                                   'error threshold ($value_limit$unit)',
                           'token': 'cpu_error'},
              'ProcQueueLen': {'level': 'minor',
                               'text': 'Average ($value_number samples) '
                                       'processor queue length is $value$unit, '
                                       'which is >= $value_limit$unit. Last '
                                       'value is $value_last$unit.',
                               'token': 'proc_q_len'}}}
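
The same traversal can also be done without `eval`, which avoids executing strings built from file contents. The following is a minimal sketch rather than part of the answer's code: `get_section` and `parse_conf` are hypothetical helper names, and it assumes the same conventions as above (one entry per line, keys and values separated by ` = `).

```python
from functools import reduce
import operator


def get_section(conf, path):
    """Follow each key in `path` and return the nested dict it leads to."""
    return reduce(operator.getitem, path, conf)


def parse_conf(text):
    conf, section = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                          # ignore blank lines
        if line.startswith("</"):
            section.pop()                     # leaving the current section
        elif line.startswith("<"):
            name = line.strip("<>")
            get_section(conf, section)[name] = {}
            section.append(name)              # entering a new section
        else:
            # Split on the first ' = ' only, so values may contain '='.
            key, _, value = line.partition(" = ")
            get_section(conf, section)[key.strip()] = value.strip()
    return conf


sample = """\
<messages>
   <CpuError>
      level = major
      token = cpu_error
   </CpuError>
</messages>
"""
conf = parse_conf(sample)
print(conf)
print(get_section(conf, ["messages", "CpuError"])["level"])
```

With an empty `path`, `get_section` returns the top-level dictionary itself, so the same call covers both top-level and nested sections.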

Regarding python - Using a list to specify which sub-dictionary to use, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56446395/
