python - Changing namespace prefixes with ElementTree in Python

By default, when you call ElementTree.parse(someXMLfile), the Python ElementTree library prefixes the tag of every parsed node with its namespace URI in Clark notation:

    {http://example.org/namespace/spec}mynode

This makes accessing specific nodes by name a huge pain later in the code.
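
To illustrate the behaviour described above, here is a minimal sketch. The sample document and the names SAMPLE, root and child are made up for this example and are not part of the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
SAMPLE = (
    '<ns:root xmlns:ns="http://example.org/namespace/spec">'
    '<ns:mynode/>'
    '</ns:root>'
)

root = ET.fromstring(SAMPLE)
child = root[0]

# The "ns" prefix is gone; the tag carries the full URI in Clark notation.
print(child.tag)

# A lookup by the bare local name does not match the namespaced element.
print(root.find('mynode'))

# The fully qualified name has to be spelled out instead.
print(root.find('{http://example.org/namespace/spec}mynode') is child)
```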

I've read through the docs on ElementTree and namespaces, and it looks like the iterparse() function should allow me to alter the way the parser prefixes namespaces, but for the life of me I can't actually make it change the prefix. It seems like the prefixing may happen in the background before the start-ns event even fires, as in this example:

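from xml.etree.ElementTree import iterparse

namespaces = []  # stack of (prefix, uri) pairs currently in scope
# start-ns/end-ns events are only reported when requested explicitly
for event, elem in iterparse(source, events=("start", "end", "start-ns", "end-ns")):
    if event == "start-ns":
        namespaces.append(elem)  # elem is a (prefix, uri) tuple here
    elif event == "end-ns":
        namespaces.pop()  # elem is None here
    else:
        ...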
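
As a rough sketch of what these events deliver, the snippet below records the payload of each event for a tiny document. The sample XML and the names SAMPLE and seen are assumptions made for this illustration. It suggests that start-ns only reports the (prefix, uri) pair, while element tags arrive in Clark notation regardless.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this illustration.
SAMPLE = b'<a:root xmlns:a="http://example.org/a"><a:item/></a:root>'

seen = []
events = ("start-ns", "start", "end-ns")
for event, payload in ET.iterparse(io.BytesIO(SAMPLE), events=events):
    if event == "start-ns":
        # payload is a (prefix, uri) tuple
        seen.append((event, payload))
    elif event == "start":
        # payload is an Element; its tag is already in Clark notation
        seen.append((event, payload.tag))
    else:
        # "end-ns": payload is None
        seen.append((event, payload))

for entry in seen:
    print(entry)
```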

How do I get it to change the prefixing behaviour, and what is the right thing to return when the function ends?

Best answer

You don't need to use iterparse specifically. Instead, the following script:

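from io import StringIO
import xml.etree.ElementTree as ET

# Maps each namespace URI to the prefix to display for it.
NS_MAP = {
    'http://www.red-dove.com/ns/abc': 'rdc',
    'http://www.adobe.com/2006/mxml': 'mx',
    'http://www.red-dove.com/ns/def': 'oth',
}

DATA = '''<?xml version="1.0" encoding="utf-8"?>
<rdc:container xmlns:mx="http://www.adobe.com/2006/mxml"
                 xmlns:rdc="http://www.red-dove.com/ns/abc"
                 xmlns:oth="http://www.red-dove.com/ns/def">
  <mx:Style>
    <oth:style1/>
  </mx:Style>
  <mx:Style>
    <oth:style2/>
  </mx:Style>
  <mx:Style>
    <oth:style3/>
  </mx:Style>
</rdc:container>'''


def fixtag(tag, ns_map):
    # Local stand-in for ET.fixtag, an undocumented helper that older
    # ElementTree releases shipped and current ones do not. It converts
    # '{uri}local' to 'prefix:local' using the URI -> prefix mapping.
    if not tag.startswith('{'):
        return tag
    uri, local = tag[1:].split('}', 1)
    return '%s:%s' % (ns_map[uri], local)


tree = ET.parse(StringIO(DATA))
some_node = tree.getroot()[1]  # the second <mx:Style> element
print(fixtag(some_node.tag, NS_MAP))
some_node = some_node[0]  # its <oth:style2> child
print(fixtag(some_node.tag, NS_MAP))

produces

mx:Style
oth:style2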

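This shows how to get the prefixed, fully qualified tag name of individual nodes in a parsed tree. You should be able to adapt it to your specific needs.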
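
If the goal is mainly to look nodes up by a short prefixed name, one possible adaptation is to hand a prefix-to-URI mapping to find() and findall() through their namespaces argument. The sketch below assumes a shortened sample document, and the names PREFIXES and SAMPLE are introduced here for illustration only.

```python
import xml.etree.ElementTree as ET

# URI -> prefix, in the same shape as NS_MAP above (shortened here).
NS_MAP = {
    'http://www.adobe.com/2006/mxml': 'mx',
    'http://www.red-dove.com/ns/def': 'oth',
}
# find()/findall() expect the opposite direction: prefix -> URI.
PREFIXES = {prefix: uri for uri, prefix in NS_MAP.items()}

# Hypothetical, shortened sample document.
SAMPLE = '''<root xmlns:mx="http://www.adobe.com/2006/mxml"
      xmlns:oth="http://www.red-dove.com/ns/def">
  <mx:Style><oth:style1/></mx:Style>
  <mx:Style><oth:style2/></mx:Style>
</root>'''

root = ET.fromstring(SAMPLE)

# Prefixed paths are resolved through PREFIXES, so the query itself
# does not need to spell out Clark notation.
styles = root.findall('mx:Style', PREFIXES)
print(len(styles))

style2 = root.find('mx:Style/oth:style2', PREFIXES)
print(style2.tag)
```

The prefixes used in the query come from the mapping, not from the document, so they can differ from the ones declared in the XML. The tags stored on the elements stay in Clark notation.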

Regarding "python - Changing namespace prefixes with ElementTree in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/1249876/
