python - How to output the XML declaration <?xml version="1.0"?> in Python/ElementTree

Tags: python xml character-encoding elementtree

I am trying to create an XML file for a Word citation source file in XML format. When I write the file with only "xml_declaration=True", it shows <?xml version='1.0' encoding='us-ascii'?>, but I want it in the form <?xml version="1.0"?>.

import uuid
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import Element, ElementTree

root=Element('b:sources')
root.set('SelectedStyle','')
root.set('xmlns:b','http://schemas.openxmlformats.org/officeDocument/2006/bibliography')
root.set('xmlns','http://schemas.openxmlformats.org/officeDocument/2006/bibliography')
#root.attrib=('SelectedStyle'='', 'xmlns:b'='"http://schemas.openxmlformats.org/officeDocument/2006/bibliography"', 'xmlns:b'='"http://schemas.openxmlformats.org/officeDocument/2006/bibliography"','xmlns'='"http://schemas.openxmlformats.org/officeDocument/2006/bibliography"')


source=ET.SubElement(root, 'b:source')
ET.SubElement(source,'b:Tag')
ET.SubElement(source,'b:SourceType').text='Misc'
ET.SubElement(source,'b:guid').text=str(uuid.uuid1())

Author=ET.SubElement(source,'b:Author')
Author2=ET.SubElement(Author,'b:Author')
ET.SubElement(Author2,'b:Corporate').text='Norsk olje og gass'

ET.SubElement(source, 'b:Title').text='R-002'
ET.SubElement(source, 'b:Year').text='2019'
ET.SubElement(source, 'b:Month').text='10'
ET.SubElement(source, 'b:Day').text='27'


tree=ElementTree(root)

tree.write('Sources.xml', xml_declaration=True, method='xml')

Best answer

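With xml.etree.ElementTree, a declaration requested through xml_declaration=True always includes the encoding attribute. If you do not want the encoding attribute in the XML declaration at all, you need to use xml.dom.minidom instead of xml.etree.ElementTree.

Here is a snippet that sets up the example: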

import xml.etree.ElementTree
a = xml.etree.ElementTree.Element('a')
tree = xml.etree.ElementTree.ElementTree(element=a)
root = tree.getroot()

Encoding omitted (the xml_declaration argument of tostring() requires Python 3.8 or later):

out = xml.etree.ElementTree.tostring(root, xml_declaration=True)
b"<?xml version='1.0' encoding='us-ascii'?>\n<a />"

Encoding us-ascii:

out = xml.etree.ElementTree.tostring(root, encoding='us-ascii', xml_declaration=True)
b"<?xml version='1.0' encoding='us-ascii'?>\n<a />"

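Encoding unicode (the result is a str, and the encoding name shown in the declaration can vary with the Python version and locale):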

out = xml.etree.ElementTree.tostring(root, encoding='unicode', xml_declaration=True)
"<?xml version='1.0' encoding='UTF-8'?>\n<a />"

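Using minidom:

Take the first example above, where the encoding is omitted, and use the variable out as input to xml.dom.minidom. You will see the output you are looking for (minidom puts a space before the closing ?>, which is equivalent XML).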

import xml.dom.minidom
dom = xml.dom.minidom.parseString(out)
dom.toxml()
'<?xml version="1.0" ?><a/>'

There is also a pretty-print option:

dom.toprettyxml()
'<?xml version="1.0" ?>\n<a/>\n'
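
Applied to the question's goal of writing a file, the round trip through minidom could be wrapped in a small helper. This is a minimal sketch and not part of the original answer: the function name write_without_encoding is hypothetical, the file goes to a temporary directory purely for illustration, and it assumes UTF-8 is acceptable on disk, since that is what XML parsers assume when no encoding is declared.

```python
import os
import tempfile
import xml.dom.minidom
import xml.etree.ElementTree as ET


def write_without_encoding(root, path):
    """Hypothetical helper: write `root` to `path` with a declaration
    that carries no encoding attribute."""
    # Default tostring() output is ASCII bytes with no declaration.
    raw = ET.tostring(root)
    # minidom adds <?xml version="1.0" ?> when toxml() gets no encoding.
    text = xml.dom.minidom.parseString(raw).toxml()
    # With no declared encoding, XML parsers assume UTF-8.
    with open(path, 'w', encoding='utf-8') as fh:
        fh.write(text)
    return text


root = ET.Element('a')
ET.SubElement(root, 'b').text = 'R-002'
with tempfile.TemporaryDirectory() as tmp:
    result = write_without_encoding(root, os.path.join(tmp, 'Sources.xml'))
print(result)
```

Passing an encoding to toxml() would bring the encoding attribute back, so the helper leaves it out and handles the file encoding in open() instead.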

Notes

Looking at the source code, you can see that the encoding attribute is hard-coded into the declaration that gets written.

        with _get_writer(file_or_filename, encoding) as (write, declared_encoding):
            if method == "xml" and (xml_declaration or
                    (xml_declaration is None and
                     declared_encoding.lower() not in ("utf-8", "us-ascii"))):
                write("<?xml version='1.0' encoding='%s'?>\n" % (
                    declared_encoding,))

https://github.com/python/cpython/blob/550c44b89513ea96d209e2ff761302238715f082/Lib/xml/etree/ElementTree.py#L731-L736
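
Because the declaration text is hard-coded, another route is to let ElementTree write no declaration at all and prepend the desired one by hand. This is not from the original answer; it is a minimal sketch that assumes UTF-8 output is acceptable, and the names DECLARATION and serialize_with_plain_declaration are hypothetical.

```python
import xml.etree.ElementTree as ET

# The exact declaration the question asks for.
DECLARATION = b'<?xml version="1.0"?>\n'


def serialize_with_plain_declaration(root):
    """Hypothetical helper: return UTF-8 bytes prefixed by a hand-written declaration."""
    # With encoding='utf-8' and xml_declaration left unset,
    # ElementTree emits no declaration of its own.
    body = ET.tostring(root, encoding='utf-8')
    return DECLARATION + body


root = ET.Element('a')
data = serialize_with_plain_declaration(root)
print(data)

# The result still parses as ordinary XML.
parsed = ET.fromstring(data)
```

The bytes can then be written with open(path, 'wb'). The trade-off is that the declaration is now maintained by hand, so it has to stay consistent with the bytes actually written.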

This question, "python - How to output the XML declaration <?xml version="1.0"?> in Python/ElementTree", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/64760695/
