python - Parsing XML with XPath in Python 3

Tags: python xml

I have the following XML:

<document>
  <internal-code code="201">
    <internal-desc>Biscuits Wrapped</internal-desc>
    <top-grouping>Finished</top-grouping>
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits (Wrapped)</web-sub-category>
  </internal-code>
  <internal-code code="202">
    <internal-desc>Biscuits Sweet</internal-desc>
    <top-grouping>Finished</top-grouping>
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits (Sweets)</web-sub-category>
  </internal-code>
  <internal-code code="221">
    <internal-desc>Biscuits Savoury</internal-desc>
    <top-grouping>Finished</top-grouping>
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits For Cheese</web-sub-category>
  </internal-code>
  ....
</document>

I have loaded it into a tree using this code:

try:
  groups = etree.parse(PRODUCT_GROUPS_XML_FILEPATH)
  root = groups.getroot()
  internalGroup = root.findall("./internal-code")
  LOG.append("[INFO] product groupings file loaded and parsed ok")
except Exception as e:
  LOG.append("[ERROR] PRODUCT GROUPINGS XML FILE ACCESS PROBLEM")
  LOG.append("[***TERMINATED***]")
  writelog()
  exit()

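I want to use XPath to find the right internal-code element and then access that group's child nodes. So if I am searching for internal code 221 and want the web category, I would do something like this: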

internalGroup.find("internal-code", 221).get("web-category").text

I have no experience with XML and Python, and I have been staring at this for a long time. Any help would be greatly appreciated. Thanks.

Best answer

According to the xml.etree.ElementTree documentation:

XPath support

This module provides limited support for XPath expressions for locating elements in a tree. The goal is to support a small subset of the abbreviated syntax; a full XPath engine is outside the scope of the module.

Use lxml:

>>> import lxml.etree as ET
>>>
>>> s = '''
... <document>
...   <internal-code code="201">
...     <internal-desc>Biscuits Wrapped</internal-desc>
...     <top-grouping>Finished</top-grouping>
...     <web-category>Biscuits</web-category>
...     <web-sub-category>Biscuits (Wrapped)</web-sub-category>
...   </internal-code>
...   <internal-code code="202">
...     <internal-desc>Biscuits Sweet</internal-desc>
...     <top-grouping>Finished</top-grouping>
...     <web-category>Biscuits</web-category>
...     <web-sub-category>Biscuits (Sweets)</web-sub-category>
...   </internal-code>
...   <internal-code code="221">
...     <internal-desc>Biscuits Savoury</internal-desc>
...     <top-grouping>Finished</top-grouping>
...     <web-category>Biscuits</web-category>
...     <web-sub-category>Biscuits For Cheese</web-sub-category>
...   </internal-code>
... </document>
... '''
>>>
>>> root = ET.fromstring(s)
>>> for text in root.xpath('.//internal-code[@code="221"]/web-category/text()'):
...     print(text)
...
Biscuits
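
For this particular lookup, the limited XPath subset in the standard library may already be enough, since it supports attribute predicates of the form [@attrib='value']. The following is a minimal sketch under that assumption, not a replacement for lxml's full XPath engine. The helper name lookup and the shortened sample XML are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened sample based on the XML in the question.
XML = """
<document>
  <internal-code code="202">
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits (Sweets)</web-sub-category>
  </internal-code>
  <internal-code code="221">
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits For Cheese</web-sub-category>
  </internal-code>
</document>
"""


def lookup(root, code, field):
    """Return the text of child `field` under the internal-code element
    whose code attribute equals `code`, or None if either is missing."""
    # The code value is interpolated without escaping, so this is only
    # suitable for trusted values such as numeric product codes.
    group = root.find(f"./internal-code[@code='{code}']")
    if group is None:
        return None
    # findtext returns None when the child element does not exist.
    return group.findtext(field)


root = ET.fromstring(XML)
print(lookup(root, 221, "web-sub-category"))  # Biscuits For Cheese
print(lookup(root, 999, "web-category"))      # None
```

Returning None for an unknown code or field avoids the AttributeError that a chained find(...).text would raise when nothing matches. Queries that need XPath functions or text() steps, as in the lxml example above, still require lxml.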

This question and answer on parsing XML with XPath in Python 3 comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/21628290/
