python - Selecting a specific 'div' element with BeautifulSoup

I want to extract the second link (that is, the link for the number "2") from the following HTML:

<div class="post-footers">
    1 |<a href="index.html?page=2"> 2 </a>
    |<a href="index.html?page=3"> 3 </a>
    |<a href="index.html?page=4"> 4 </a> 
</div>

So I wanted to collect all the hrefs into a list and then take the element at index 1, like this:

tags = soup.find("div", class_="post-footer")
links = tags.get('href')
print links[1]

But it returns an error:

newtags.get('href', None) 
AttributeError: 'NoneType' object has no attribute 'get'

This means that tags is empty (None). So where did my code go wrong?

Thanks in advance to anyone who can help :)

Best answer

Try this:

Attempt 1

In [1]: tags = soup.find("div", class_="post-footers")
In [2]: links = [a["href"] for a in tags.find_all("a")]
In [3]: print(links)

Result 1

['index.html?page=2', 'index.html?page=3', 'index.html?page=4']

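Your code contains a typo: you used post-footer instead of post-footers. Because no div matches post-footer, find() returns None, and calling .get() on None raises the AttributeError. Note also that the href attributes belong to the a tags, not to the div itself, so they have to be read from the a tags inside it.

Attempt 2

If you pass href=True, you get every a tag that has an href attribute, like this: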
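
As a minimal sketch of the point above, the snippet below guards against find() returning None before reading the hrefs. The HTML string is copied from the question; the helper name extract_hrefs and the use of the built-in "html.parser" backend are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# HTML fragment taken from the question.
HTML = """
<div class="post-footers">
    1 |<a href="index.html?page=2"> 2 </a>
    |<a href="index.html?page=3"> 3 </a>
    |<a href="index.html?page=4"> 4 </a>
</div>
"""


def extract_hrefs(html, css_class):
    """Return the href of every <a> inside the first <div> with css_class.

    Hypothetical helper: returns an empty list when no such div exists.
    """
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=css_class)
    if container is None:
        # find() returns None when nothing matches, e.g. a misspelled class.
        return []
    # href=True keeps only <a> tags that actually carry an href attribute.
    return [a["href"] for a in container.find_all("a", href=True)]


hrefs = extract_hrefs(HTML, "post-footers")   # correct class name
missing = extract_hrefs(HTML, "post-footer")  # the typo from the question
print(hrefs)
print(missing)
```

With the misspelled class name the helper returns an empty list rather than raising, which makes the typo easier to spot.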

In [28]: tags = soup.find("div", class_ = "post-footers")
In [31]: links = tags.find_all('a',href=True)

Result 2

[<a href="index.html?page=2"> 2 </a>,
 <a href="index.html?page=3"> 3 </a>,
 <a href="index.html?page=4"> 4 </a>]
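
Result 2 is a list of Tag objects rather than strings. The following sketch, which assumes the same HTML as the question and the built-in "html.parser" backend, shows one way to get from those tags to the link labelled "2". The by_label dictionary is an illustrative addition, not part of the original answer. Since the leading "1" is plain text and not a link, the link for "2" ends up at index 0 of the list, not index 1.

```python
from bs4 import BeautifulSoup

# HTML fragment taken from the question.
HTML = """
<div class="post-footers">
    1 |<a href="index.html?page=2"> 2 </a>
    |<a href="index.html?page=3"> 3 </a>
    |<a href="index.html?page=4"> 4 </a>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
tags = soup.find("div", class_="post-footers")
links = tags.find_all("a", href=True)  # list of Tag objects, as in Result 2

# Read the href attribute from each tag to get plain strings.
hrefs = [a["href"] for a in links]

# The "1" in the markup is not a link, so the link for "2" is the first one.
link_by_index = hrefs[0]

# Alternatively, look the link up by its visible label (illustrative only).
by_label = {a.get_text(strip=True): a["href"] for a in links}
link_by_label = by_label["2"]

print(link_by_index)
print(link_by_label)
```

Looking the link up by its label avoids depending on the position of the anchor in the list.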

Regarding python - Selecting a specific 'div' element with BeautifulSoup, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41779130/
