html - How to use Scrapy XPath with XML namespaces?

How can I extract the content of <content:encoded> ... </content:encoded> from an RSS feed using Scrapy XPath (example below)?

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Latest &#8211; Reason.com</title>
    <item>
        <pubDate>Thu, 16 Jan 2020 21:40:23 +0000</pubDate>
        <content:encoded><![CDATA[<p><span style="font-weight: 400">
          Jimmy Meders was scheduled to die by lethal injection today, 
          but the Georgia parole board has granted him clemency.</span></p>]]> 
        </content:encoded>
...

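I tried response.xpath('//content:encoded').get(), but it doesn't work.

Any help is greatly appreciated.

Best answer

You have to register the XML namespace prefix with the selector before using it in an XPath expression: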

response.selector.register_namespace('content', 
                                     'http://purl.org/rss/1.0/modules/content/')
response.xpath('//content:encoded').getall()

Documentation: register_namespace()
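
To see why the registration matters, here is a minimal sketch that runs without Scrapy. It swaps in Python's standard-library xml.etree.ElementTree (not the Scrapy selector API) and a trimmed copy of the feed above. The names FEED, CONTENT_NS, and the alternate prefix "c" are made up for illustration. The point it shows is that the prefix in the query is only a local alias: what has to match the document is the namespace URI.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed from the question (illustrative text only).
FEED = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <content:encoded><![CDATA[<p>Clemency granted.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

# The namespace URI declared in the feed's xmlns:content attribute.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

root = ET.fromstring(FEED)

# Bind a prefix to the URI for this query, then use the prefix in the path.
by_same_prefix = root.findall(".//content:encoded", {"content": CONTENT_NS})

# Any prefix works as long as it is bound to the same URI.
by_other_prefix = root.findall(".//c:encoded", {"c": CONTENT_NS})

print(by_same_prefix[0].text)
```

In the Scrapy answer above, register_namespace() plays the role of the prefix-to-URI dictionary passed to findall() here.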

Source: "html - How to use Scrapy XPath with XML namespaces?" on Stack Overflow: https://stackoverflow.com/questions/59777994/
