python - How do I escape a forward slash in XPath?

How do I escape a forward slash character in an XPath query? My tags contain a URL, so I need to be able to do this. I am using lxml in Python.

Alternatively, can XPath query on a substring of the path? Here is an example:

xml="""
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="reportName">bbb</gsa:content>
  <gsa:content name="collectionName">default_collection</gsa:content>
  <gsa:content name="reportDate">date_3_25_2009</gsa:content>
 </entry>
"""

When I run the following:

from lxml.etree import fromstring

tree = fromstring(xml)
for elt in tree.xpath('//*'):
    elt.tag

it returns:

'{http://www.w3.org/2005/Atom}entry'
'{http://schemas.google.com/gsa/2007}content'
'{http://schemas.google.com/gsa/2007}content'
'{http://schemas.google.com/gsa/2007}content'

Running tree.xpath('/entry') returns an empty list.

I need to be able to query for '{http://www.w3.org/2005/Atom}entry' as the tag, or to query for 'entry' anywhere in the tag.

Best answer

Take a look at namespace prefixes in the lxml docs.

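If you want the elements in the http://schemas.google.com/gsa/2007 namespace, you need to bind that URI to a prefix and search with the prefix. The slashes never need escaping, because the URI lives in the namespace map rather than in the XPath expression itself: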

import lxml.etree as et

xml="""
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="reportName">bbb</gsa:content>
  <gsa:content name="collectionName">default_collection</gsa:content>
  <gsa:content name="reportDate">date_3_25_2009</gsa:content>
 </entry>
"""

NS = {'rootns': 'http://www.w3.org/2005/Atom',
      'gsa': 'http://schemas.google.com/gsa/2007'}

tree = et.fromstring(xml)

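# Each gsa:content element is matched through the 'gsa' prefix bound in NS.
for el in tree.xpath('//gsa:content', namespaces=NS):
    print(el.attrib['name'])

# The root element sits in the default (Atom) namespace, bound here to 'rootns'.
print(len(tree.xpath('//rootns:entry', namespaces=NS)))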
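
For the second part of the question (matching 'entry' wherever it appears, or querying by the full '{uri}tag' form), here is a minimal sketch of two other approaches that should work with lxml. It assumes Python 3 and the same sample document; the constant names ATOM and GSA are only illustrative and do not come from the original answer.

```python
from lxml import etree

# Illustrative constants (not part of the original answer).
ATOM = "http://www.w3.org/2005/Atom"
GSA = "http://schemas.google.com/gsa/2007"

xml = """
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="reportName">bbb</gsa:content>
  <gsa:content name="collectionName">default_collection</gsa:content>
  <gsa:content name="reportDate">date_3_25_2009</gsa:content>
</entry>
"""

tree = etree.fromstring(xml)

# 1. XPath: compare only the local part of the name, ignoring the namespace.
entries = tree.xpath('//*[local-name() = "entry"]')

# 2. find/findall/iter accept the '{uri}tag' form directly (no XPath escaping
#    involved, since these methods do not parse the string as full XPath).
contents = tree.findall('{%s}content' % GSA)
names = [el.get('name') for el in contents]

# QName splits a qualified tag into its namespace and local name.
root_name = etree.QName(tree)

print(len(entries))
print(names)
print(root_name.namespace, root_name.localname)
```

The local-name() test is convenient for ad hoc queries but matches an 'entry' element in any namespace, so the prefix map shown in the answer is the safer choice when the namespace matters.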

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/7980841/
