Question

I would like to be able to dump a dictionary containing long strings that I want emitted in block style for readability. For example:
foo: |
  this is a
  block literal
bar: >
  this is a
  folded block
PyYAML supports loading documents in this style, but I can't seem to find a way to dump documents this way. Am I missing something?

Answers
# Python 2 (uses the unicode type and the print statement)
import yaml

# Marker types: the subclass tells the dumper which block style to use
class folded_unicode(unicode): pass
class literal_unicode(unicode): pass

def folded_unicode_representer(dumper, data):
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='>')

def literal_unicode_representer(dumper, data):
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')

yaml.add_representer(folded_unicode, folded_unicode_representer)
yaml.add_representer(literal_unicode, literal_unicode_representer)

data = {
    'literal': literal_unicode(
        u'by hjw ___\n'
        ' __ /.-.\\\n'
        ' / )_____________\\\\ Y\n'
        ' /_ /=== == === === =\\ _\\_\n'
        '( /)=== == === === == Y \\\n'
        ' `-------------------( o )\n'
        ' \\___/\n'),
    'folded': folded_unicode(
        u'It removes all ordinary curses from all equipped items. '
        'Heavy or permanent curses are unaffected.\n')}

print yaml.dump(data)
Result:
folded: >
  It removes all ordinary curses from all equipped items. Heavy or permanent curses
  are unaffected.
literal: |
  by hjw ___
   __ /.-.\
   / )_____________\\ Y
   /_ /=== == === === =\ _\_
  ( /)=== == === === == Y \
   `-------------------( o )
   \___/
For completeness, there ought to be str implementations as well, but I'm going to be lazy :-)
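
The code above targets Python 2, where unicode and str are separate types. As a minimal sketch of the same idea on Python 3, assuming a PyYAML installation where str is the only text type, the marker classes can subclass str directly. The names literal_str, folded_str, represent_literal and represent_folded are chosen for illustration only:

```python
import yaml

class literal_str(str):
    """Marker type: dump as a literal block scalar (|)."""

class folded_str(str):
    """Marker type: dump as a folded block scalar (>)."""

def represent_literal(dumper, data):
    # Same string tag as a plain str, but request the literal style.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')

def represent_folded(dumper, data):
    # Same string tag as a plain str, but request the folded style.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='>')

# Registers on the default Dumper used by yaml.dump().
yaml.add_representer(literal_str, represent_literal)
yaml.add_representer(folded_str, represent_folded)

data = {
    'foo': literal_str('this is a\nblock literal\n'),
    'bar': folded_str('this is a folded block\n'),
}
text = yaml.dump(data)
print(text)
```

Because the tag is still the ordinary string tag, loading the dumped text gives back plain str values with the same content.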

PyYAML does support dumping literal or folded blocks.
Using Representer.add_representer
Define the types:
class folded_str(str): pass
class literal_str(str): pass
class folded_unicode(unicode): pass
class literal_unicode(unicode): pass
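Then you can define the representers for those types.
Please note that while Gary's solution works great for unicode, you may need some more work to get str working right (see the implementation of represent_str).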
def change_style(style, representer):
    def new_representer(dumper, data):
        scalar = representer(dumper, data)
        scalar.style = style
        return scalar
    return new_representer

import yaml
from yaml.representer import SafeRepresenter

# represent_str does handle some corner cases, so use that
# instead of calling represent_scalar directly
represent_folded_str = change_style('>', SafeRepresenter.represent_str)
represent_literal_str = change_style('|', SafeRepresenter.represent_str)
represent_folded_unicode = change_style('>', SafeRepresenter.represent_unicode)
represent_literal_unicode = change_style('|', SafeRepresenter.represent_unicode)
Then you can add those representers to the default dumper:
yaml.add_representer(folded_str,represent_folded_str)
yaml.add_representer(literal_str,represent_literal_str)
yaml.add_representer(folded_unicode,represent_folded_unicode)
yaml.add_representer(literal_unicode,represent_literal_unicode)
... and test it:
data = {
    'foo': literal_str('this is a\nblock literal'),
    'bar': folded_unicode('this is a folded block'),
}
print yaml.dump(data)
Result:
bar: >-
  this is a folded block
foo: |-
  this is a
  block literal
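
The marker types above require wrapping every value by hand. A possible variation, sketched here for Python 3 and not taken from the answer, registers one representer for plain str on a dumper subclass and asks for the literal style only when the string contains a line break. BlockDumper and represent_multiline_str are hypothetical names:

```python
import yaml

class BlockDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass; keeps the change off the global dumper."""

def represent_multiline_str(dumper, data):
    # Request the literal style only for strings that span several lines;
    # style=None lets the emitter choose as usual.
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)

BlockDumper.add_representer(str, represent_multiline_str)

data = {'foo': 'line1\nline2\nline3', 'name': 'short value'}
text = yaml.dump(data, Dumper=BlockDumper)
print(text)
```

The requested style is only a hint: the emitter can still fall back to double quotes when the content does not qualify for a block scalar.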
Using default_style
If you want all of your strings to follow a default style, you can also use the default_style keyword argument, for example:
>>> data = { 'foo': 'line1\nline2\nline3' }
>>> print yaml.dump(data, default_style='|')
"foo": |-
  line1
  line2
  line3
or for folded scalars:
>>> print yaml.dump(data, default_style='>')
"foo": >-
  line1

  line2

  line3
or for double-quoted scalars:
>>> print yaml.dump(data, default_style='"')
"foo": "line1\nline2\nline3"
Caveats:
Here is an example of something you may not expect:
data = {
    'foo': literal_str('this is a\nblock literal'),
    'bar': folded_unicode('this is a folded block'),
    'non-printable': literal_unicode('this has a \t tab in it'),
    'leading': literal_unicode(' with leading white spaces'),
    'trailing': literal_unicode('with trailing white spaces '),
}
print yaml.dump(data)
This results in:
bar: >-
  this is a folded block
foo: |-
  this is a
  block literal
leading: |2-
   with leading white spaces
non-printable: "this has a \t tab in it"
trailing: "with trailing white spaces "
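1) Non-printable characters
See the YAML spec on escaped characters (Section 5.7):
Note that escape sequences are only interpreted in double-quoted scalars. In all other scalar styles, the "\" character has no special meaning and non-printable characters are not available.
If you want to preserve non-printable characters (e.g. TAB), you need to use double-quoted scalars. If you are able to dump a scalar in literal style while it contains a non-printable character (e.g. TAB), your YAML dumper is non-compliant.
For example, PyYAML detects the non-printable character \t and uses the double-quoted style even when a default style is specified: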
>>> data = { 'foo': 'line1\nline2\n\tline3' }
>>> print yaml.dump(data, default_style='"')
"foo": "line1\nline2\n\tline3"
>>> print yaml.dump(data, default_style='>')
"foo": "line1\nline2\n\tline3"
>>> print yaml.dump(data, default_style='|')
"foo": "line1\nline2\n\tline3"
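2) Leading and trailing white spaces
Another useful bit of information in the spec is:
All leading and trailing white space characters are excluded from the content
This means that if your string does have leading or trailing white space, it would not be preserved in scalar styles other than double-quoted. As a consequence, PyYAML tries to detect what is in your scalar and may force the double-quoted style.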
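
To check up front whether a string is likely to survive as a block scalar, the two caveats can be approximated with a small helper. This is a rough sketch based only on the behaviour shown above, not PyYAML's actual analysis logic; block_style_likely is a hypothetical name, and the helper is deliberately conservative (it also rejects non-ASCII text):

```python
import string

# Characters treated as safe here: printable ASCII plus the newline.
# TAB and the other control whitespace characters are excluded on purpose.
_SAFE = set(string.printable) - set('\t\r\x0b\x0c')

def block_style_likely(text):
    """Rough guess whether `text` can be emitted as a block scalar."""
    if any(ch not in _SAFE for ch in text):
        return False  # e.g. a TAB forces the double-quoted style
    # A space right before a line break, or at the very end, also rules it out.
    return not any(line.endswith(' ') for line in text.split('\n'))

print(block_style_likely('this is a\nblock literal'))
print(block_style_likely('this has a \t tab in it'))
print(block_style_likely('with trailing white spaces '))
```

Leading white space is not rejected, in line with the `|2-` output for the 'leading' key above.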
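
This can be done relatively easily. The only "hurdle" is how to indicate which of the spaces in the string that is to be represented as a folded scalar should become a fold. A literal scalar has explicit newlines that carry this information, but that cannot be used for folded scalars, because they can contain explicit newlines themselves, for example when there is leading white space. A folded scalar also needs a newline at the end in order not to be emitted with the stripping chomping indicator (>-).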
import sys
import ruamel.yaml

folded = ruamel.yaml.scalarstring.FoldedScalarString
literal = ruamel.yaml.scalarstring.LiteralScalarString

yaml = ruamel.yaml.YAML()
data = dict(
    foo=literal('this is a\nblock literal\n'),
    bar=folded('this is a folded block\n'),
)
# fold at the space in front of "folded"
data['bar'].fold_pos = [data['bar'].index(' folded')]
yaml.dump(data, sys.stdout)
This gives:
foo: |
  this is a
  block literal
bar: >
  this is a
  folded block
The fold_pos attribute expects a reversible iterable holding the positions of the spaces at which to fold.
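
When the fold points are not known in advance, one option is to derive them from a target line width. The following is a minimal standard-library sketch, assuming fold_pos takes the indices of the spaces to fold at as described above; fold_positions is a hypothetical helper and not part of ruamel.yaml:

```python
def fold_positions(text, width=12):
    """Return indices of spaces where a greedy wrap to `width` columns breaks."""
    positions = []
    line_start = 0     # index at which the current output line begins
    last_space = None  # most recent space seen on the current line
    for i, ch in enumerate(text):
        if ch == '\n':
            # An explicit newline starts a fresh line.
            line_start = i + 1
            last_space = None
        elif ch == ' ':
            last_space = i
        if i - line_start >= width and last_space is not None:
            # The line is too long: fold at the last space seen on it.
            positions.append(last_space)
            line_start = last_space + 1
            last_space = None
    return positions

s = 'this is a folded block\n'
print(fold_positions(s, width=12))
```

For this string and width the helper returns the single index of the space before "folded", the same position that was set by hand in the example. The resulting list can be assigned to fold_pos, since a list is a reversible iterable.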
If you never have pipe characters ('|') in your strings, you could do something like:
import re

s = 'this is a|folded block\n'
sf = folded(s.replace('|', ' '))  # need to have a space!
sf.fold_pos = [x.start() for x in re.finditer(r'\|', s)]  # | is special in re, needs escaping
data = dict(
    foo=literal('this is a\nblock literal\n'),
    bar=sf,  # need to have a space
)
yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)
which also gives exactly the output you expect.