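class - Finding HTML elements by class with lxml

I have searched everywhere, and the most I have found is doc.xpath('//element[@class="classname"]'), but no matter what I try, it does not work.

The code I am using: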

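import lxml.html
from urllib.request import urlopen

def check():
    # Download the page and decode the bytes; str(data) on bytes would give "b'...'" text.
    data = urlopen('url').read()
    return data.decode('utf-8', 'ignore')

doc = lxml.html.document_fromstring(check())
el = doc.xpath("//div[@class='test']")
print(el)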

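It just prints an empty list.

Edit:
How odd. I used Google as a test page and it worked fine there, but it does not work on the page I actually need (YouTube).

This is the exact code I am using.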

import lxml.html
from urllib.request import urlopen

def check():
    # TopGear channel page used as a test
    data = urlopen('http://www.youtube.com/user/TopGear').read()
    return data.decode('utf-8', 'ignore')


doc = lxml.html.document_fromstring(check())
el = doc.xpath("//div[@class='channel']")
print(el)

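Best answer

The TopGear page you are testing with does not contain any <div class="channel"> element, so the XPath correctly returns an empty list. But this works (for example):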

el = doc.xpath("//div[@class='channel-title-container']")

Or this:

el = doc.xpath("//div[@class='a yb xr']")
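
To check which class values are actually present before writing the XPath, one option is to list the class attribute of every div. The following is only a minimal sketch: the markup in `sample_html` is a made-up stand-in for the downloaded page, not the real YouTube HTML.

```python
import lxml.html

# Hypothetical markup standing in for the downloaded page (an assumption).
sample_html = """
<html><body>
  <div class="channel-title-container">Top Gear</div>
  <div class="a yb xr">Other</div>
  <div>No class here</div>
</body></html>
"""

doc = lxml.html.document_fromstring(sample_html)

# Ending the path with @class returns the attribute values as strings.
div_classes = doc.xpath("//div/@class")
print(div_classes)

# An exact comparison matches only when the whole attribute string is equal,
# so 'channel' alone finds nothing in this sample.
exact = doc.xpath("//div[@class='channel']")
print(exact)
```

Running the same `//div/@class` query against the real page should show which class strings an exact `@class='...'` test can match.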

To find <div> elements whose class attribute contains the string channel, you can use:

el = doc.xpath("//div[contains(@class, 'channel')]")
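
Note that contains() does a substring test, so it also matches classes such as channel-title-container. If only elements carrying channel as a whole class token are wanted, a common XPath 1.0 idiom pads the normalized attribute with spaces before testing. This is a sketch on made-up markup (the `sample_html` content is an assumption for illustration):

```python
import lxml.html

# Hypothetical markup to compare the two kinds of match (an assumption).
sample_html = """
<html><body>
  <div class="channel">exact</div>
  <div class="channel featured">token</div>
  <div class="channel-title-container">substring only</div>
</body></html>
"""

doc = lxml.html.document_fromstring(sample_html)

# Substring test: matches any class attribute containing 'channel'.
substring_hits = doc.xpath("//div[contains(@class, 'channel')]")

# Token test: pad the whitespace-normalized attribute with spaces,
# then look for ' channel ' so only whole class names match.
token_hits = doc.xpath(
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' channel ')]"
)

print([d.text for d in substring_hits])
print([d.text for d in token_hits])
```

In this sample the substring query returns all three divs, while the token query skips the channel-title-container one.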

This answer comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/8226490/
