string - How do I format strings in a YAML dump?

Tags: string formatting dump ruamel.yaml

Dumping a multi-line string with ruamel.yaml produces the following result:

address_pattern_template: "\n^                           #the beginning of the address\
  \ string (e.g. interface number)\n(?P<junkbefore>             #capturing the junk\
  \ before the address\n    \\D?                     #an optional non-digit character\n\
  \    .*?                     #any characters (non-greedy) up to the address\n)\n\
  (?P<address>                #capturing the pure address\n    {pure_address_pattern}\n\
  )\n(?P<junkafter>              #capturing the junk after the address\n    \\D? \
  \                    #an optional non-digit character\n    .*                  \
  \    #any characters (greedy) up to the end of the string\n)\n$                \
  \           #the end of the input address string\n"

The code looks like this:

from ruamel.yaml import YAML
data = dict(
address_pattern_template=r"""
^                           #the beginning of the address string (e.g. interface number)
(?P<junkbefore>             #capturing the junk before the address
    \D?                     #an optional non-digit character
    .*?                     #any characters (non-greedy) up to the address
)
(?P<address>                #capturing the pure address
    {pure_address_pattern}
)
(?P<junkafter>              #capturing the junk after the address
    \D?                     #an optional non-digit character
    .*                      #any characters (greedy) up to the end of the string
)
$                           #the end of the input address string
"""
)
yaml = YAML(typ='safe', pure=True)
yaml.default_flow_style = False
with open(r'D:\datadump.yml', 'w') as dumpfile:
    yaml.dump(data, dumpfile)

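I would like to see the multi-line string in a readable format, i.e. with the newlines rendered as actual line breaks instead of being shown as "\n".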

Which flags/options can I set so that it is displayed like this:

address_pattern_template: |
  ^                           #the beginning of the address string (e.g. interface number)
  (?P<junkbefore>             #capturing the junk before the address
      \D?                     #an optional non-digit character
      .*?                     #any characters (non-greedy) up to the address
  )
  (?P<address>                #capturing the pure address
      {pure_address_pattern}
  )
  (?P<junkafter>              #capturing the junk after the address
      \D?                     #an optional non-digit character
      .*                      #any characters (greedy) up to the end of the string
  )
  $                           #the end of the input address string

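Note that my program logs a large dictionary, and multi-line strings like this can appear anywhere in the dictionary structure, at any depth. So walking the dictionary tree and loading each one before dumping (as suggested in "Can I control the formatting of multiline strings?") is not a good solution for me.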

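I would like to know whether there is a parameter that tells the dumper to recognise multi-line strings and dump them in block style. Single-line strings can still go on the same line as the colon. That would make the log file as readable as possible.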

Best answer

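First of all, what you present as the output you want is not a representation of the data you provide. Since the multi-line string in that data starts with a newline, the block style literal scalar for it requires a block indentation indicator and a newline at the start: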

address_pattern_template: |2

  ^                           #the beginning of the address string (e.g. interface number)
  .
  .
  .

But it makes no sense (at least not to me) for these patterns to start with a newline, so I will leave that out in the following.


If you do not know where in your data structure the multi-line strings are, but you can convert the data in place before dumping, you can use ruamel.yaml.scalarstring:walk_tree:

import sys
import ruamel.yaml

data = dict(a=[1, 2, 3, dict(
address_pattern_template="""\
^                           #the beginning of the address string (e.g. interface number)
(?P<junkbefore>             #capturing the junk before the address
    \\D?                     #an optional non-digit character
    .*?                     #any characters (non-greedy) up to the address
)
(?P<address>                #capturing the pure address
    {pure_address_pattern}
)
(?P<junkafter>              #capturing the junk after the address
    \\D?                     #an optional non-digit character
    .*                      #any characters (greedy) up to the end of the string
)
$                           #the end of the input address string
"""
)])


yaml = ruamel.yaml.YAML()
# convert every multi-line string in the nested data, in place
ruamel.yaml.scalarstring.walk_tree(data)
yaml.dump(data, sys.stdout)

which gives:

a:
- 1
- 2
- 3
- address_pattern_template: |
    ^                           #the beginning of the address string (e.g. interface number)
    (?P<junkbefore>             #capturing the junk before the address
        \D?                     #an optional non-digit character
        .*?                     #any characters (non-greedy) up to the address
    )
    (?P<address>                #capturing the pure address
        {pure_address_pattern}
    )
    (?P<junkafter>              #capturing the junk after the address
        \D?                     #an optional non-digit character
        .*                      #any characters (greedy) up to the end of the string
    )
    $                           #the end of the input address string

walk_tree replaces each multi-line string with a LiteralScalarString, which for most purposes behaves like a normal string.

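If in-place conversion is not acceptable, you can make a deep copy of the data first and then apply walk_tree to the copy. If that is not acceptable because of memory constraints, you have to provide an alternative representer for strings that checks, during representation, whether it is dealing with a multi-line string. Preferably you do this in a subclass of the representer: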

import sys
import ruamel.yaml

# data defined as before

class MyRepresenter(ruamel.yaml.representer.RoundTripRepresenter):
    def represent_str(self, data):
        # use literal block style ('|') only for strings containing a newline
        style = '|' if '\n' in data else None
        return self.represent_scalar(u'tag:yaml.org,2002:str', data, style=style)


MyRepresenter.add_representer(str, MyRepresenter.represent_str)

yaml = ruamel.yaml.YAML()
yaml.Representer = MyRepresenter
yaml.dump(data, sys.stdout)

This gives the same output as the previous example.

Regarding "string - How do I format strings in a YAML dump?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57393974/
