python - Scraping text with xpath/lxml

Tags: python xpath web-scraping beautifulsoup lxml

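I'm trying to scrape the text "2005 - 2013" from the "drink between: 2005 - 2013" line at http://www.cellartracker.com/wine.asp?iWine=91411 using xpath/lxml. I can do this for some other pages on this site, but not for this one. I'm not sure what I'm doing wrong, or whether the XPath I copied from the element is incorrect.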

It tells me:

print(content_divs[0].text_content().strip())
IndexError: list index out of range

Here is my code:

import requests, lxml.html
page = requests.get('http://www.cellartracker.com/wine.asp?iWine=91411')
html = lxml.html.fromstring(page.content)
content_divs = html.xpath('//*[@id="wine_copy_inner"]/p/a[4]')
print(content_divs[0].text_content().strip())

Thanks for your help!!!

Best answer

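If you want to get "2005 - 2013", you can use the following code, which selects the link by its title attribute instead of by its position among sibling links: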

content = html.xpath('//a[@title="Source: Community"]/text()')
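Note that xpath() returns a list, and indexing an empty list is what raises the IndexError shown in the question. The sketch below shows one way to apply the attribute-based XPath while guarding against an empty result. The SAMPLE_HTML markup and the extract_drink_window helper are hypothetical, made up for illustration; the real page's HTML is not shown in the question and may differ.

```python
import lxml.html

# Hypothetical markup standing in for the real page (an assumption;
# the actual HTML of the site is not shown in the question).
SAMPLE_HTML = """
<div id="wine_copy_inner">
  <p>
    Drink between:
    <a href="#" title="Source: Community">2005 - 2013</a>
  </p>
</div>
"""


def extract_drink_window(html_text):
    """Return the text of the first <a title="Source: Community"> link, or None."""
    doc = lxml.html.fromstring(html_text)
    # Select by attribute rather than by position such as a[4],
    # which breaks whenever a page has a different number of links.
    matches = doc.xpath('//a[@title="Source: Community"]/text()')
    # xpath() returns a list; check it before indexing to avoid IndexError.
    return matches[0].strip() if matches else None


print(extract_drink_window(SAMPLE_HTML))
print(extract_drink_window("<p>no matching link</p>"))
```

With the code from the question, the same helper could be called as extract_drink_window(page.content). Whether it finds a match depends on the markup the site actually serves.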

Source: this question about scraping text with xpath/lxml comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/45336668/
