python - Recursively searching for parent-child combinations and building a tree in Python and XML

Tags: python xml xml-parsing hierarchical-data

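I am trying to traverse this XML data, which is full of parent->child relationships, and I need a way to build the tree. Any help would be greatly appreciated. Also, in this case, is it better to express the parent-->child relationship with attributes or with nested nodes?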

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodes>
    <node name="Car" child="Engine"/>
    <node name="Car" child="Wheel"/>
    <node name="Engine" child="Piston"/>
    <node name="Engine" child="Carb"/>
    <node name="Carb" child="Bolt"/>
    <node name="Spare Wheel"/>
    <node name="Bolt" child="Thread"/>
    <node name="Carb" child="Float"/>
    <node name="Truck" child="Engine"/>
    <node name="Engine" child="Bolt"/>
    <node name="Wheel" child="Hubcap"/>
</nodes>

This is what I have so far in my Python script. My brain is fried and I can't work out the logic. Please help.

import xml.etree.ElementTree as ET
tree = ET.parse('rec.xml')
root = tree.getroot()
def find_node(data,search):
    #str = root.find('.//node[@child="1.2.1"]')
    for node in data.findall('.//node'):
        if node.attrib['name']==search:
            print('Child-->', node)

for nodes in root.findall('node'):
    parent = nodes.attrib.get('name')
    child = nodes.attrib.get('child')
    print (parent,'-->', child)
    find_node(root,child)

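The expected output would be something like this (I really don't care about the sort order, as long as every node item is represented somewhere in the tree):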

Car --> Engine --> Piston
Car --> Engine --> Carb --> Float
Car --> Engine --> Carb --> Bolt --> Thread
Car --> Wheel --> Hubcaps
Truck --> Engine --> Piston
Truck --> Engine --> Carb --> Bolt --> Thread
Truck --> Loading Bin
Spare Wheel -->

Best answer

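It has been a long time since I did anything with graphs, but this should be pretty close, although it is not the most efficient approach: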

x = """<?xml version="1.0"?>
<nodes>
    <node name="Car" child="Engine"></node>
    <node name="Engine" child="Piston"></node>
    <node name="Engine" child="Carb"></node>
    <node name="Car" child="Wheel"></node>
    <node name="Wheel" child="Hubcaps"></node>
    <node name="Truck" child="Engine"></node>
    <node name="Truck" child="Loading Bin"></node>
    <nested>
        <node name="Spare Wheel" child="Engine"></node>
    </nested>
    <node name="Spare Wheel" child=""></node>

</nodes>"""

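from lxml import etree

xml = etree.fromstring(x)
graph = {}     # parent name -> set of child names
nodes = set()  # every name seen, as parent or as child
# "//node" matches <node> elements at any depth, including the one under <nested>.
for elem in xml.xpath("//node"):
    par, child = elem.get("name"), elem.get("child", "")
    graph.setdefault(par, set()).add(child)
    nodes.update([child, par])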


def find_all_paths(graph, start, end, path=None):
    if path is None:
        path = []
    path = path + [start]
    if start == end:
        yield path
    for node in graph.get(start, []):
        if node not in path:
            for new_path in find_all_paths(graph, node, end, path):
                yield new_path


for n in graph:
    for e in nodes:
        if n != e:
            for path in find_all_paths(graph, n, e):
                if path:
                    print("--> ".join(path))

With the updated input, this gives you:

Engine--> Carb
Engine--> Piston
Car--> Engine
Car--> Wheel
Car--> Wheel--> Hubcaps
Car--> Engine--> Carb
Car--> Engine--> Piston
Spare Wheel--> Engine
Spare Wheel--> 
Spare Wheel--> Engine--> Carb
Spare Wheel--> Engine--> Piston
Wheel--> Hubcaps
Truck--> Engine
Truck--> Engine--> Carb
Truck--> Engine--> Piston
Truck--> Loading Bin
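
The listing above contains a line for every reachable pair of nodes, including intermediate ones such as `Engine--> Carb`. If only the full root-to-leaf chains from the question's expected output are wanted, one possible variation is sketched below. It is not the answerer's code: it uses the standard library's `xml.etree.ElementTree` instead of lxml, the helper names `build_graph` and `leaf_paths` are made up for illustration, and it assumes that a "root" is any name that never appears as a `child` and that a missing or empty `child` attribute means the node has no children. The sample data is the question's XML, with `Float` spelled as in the expected output.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The question's edge list, inlined so the sketch is self-contained.
XML = """<nodes>
    <node name="Car" child="Engine"/>
    <node name="Car" child="Wheel"/>
    <node name="Engine" child="Piston"/>
    <node name="Engine" child="Carb"/>
    <node name="Carb" child="Bolt"/>
    <node name="Spare Wheel"/>
    <node name="Bolt" child="Thread"/>
    <node name="Carb" child="Float"/>
    <node name="Truck" child="Engine"/>
    <node name="Engine" child="Bolt"/>
    <node name="Wheel" child="Hubcap"/>
</nodes>"""


def build_graph(xml_text):
    """Return (children, roots) for the <node name=... child=...> edge list.

    children maps a parent name to its child names in document order;
    roots are the names that never occur as a child (an assumption).
    """
    children = defaultdict(list)
    names, has_parent = [], set()
    for node in ET.fromstring(xml_text).iter("node"):
        parent, child = node.get("name"), node.get("child")
        if parent not in names:
            names.append(parent)
        if child:  # missing or empty attribute: no edge to record
            if child not in children[parent]:
                children[parent].append(child)
            has_parent.add(child)
    roots = [n for n in names if n not in has_parent]
    return children, roots


def leaf_paths(children, name, path=()):
    """Yield each path from `name` down to a leaf as a tuple of names.

    A node already on the current path is not revisited, so cyclic
    input ends the path instead of recursing forever.
    """
    path = path + (name,)
    kids = [c for c in children.get(name, []) if c not in path]
    if not kids:
        yield path
    for kid in kids:
        yield from leaf_paths(children, kid, path)


children, roots = build_graph(XML)
paths = [p for root in roots for p in leaf_paths(children, root)]
for p in paths:
    print(" --> ".join(p))
```

Because `Engine` also lists `Bolt` directly, chains such as `Car --> Engine --> Bolt --> Thread` appear alongside the ones that go through `Carb`. Input in which every node has a parent (a pure cycle) would yield no roots and therefore no output under these assumptions.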

Regarding "python - Recursively searching for parent-child combinations and building a tree in Python and XML", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37170543/
