python - Beautiful Soup "maximum recursion depth exceeded while calling a Python object"

Tags: python beautifulsoup

I am trying to do the following:

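import urllib2
from BeautifulSoup import BeautifulSoup

def get_soup(url):  # hypothetical wrapper name; the original shows only the body
    request = urllib2.Request(url=url, headers={ 'User-Agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT' })
    response = urllib2.urlopen(request)
    HTML_response = response.read()
    response.close()
    return BeautifulSoup(HTML_response)  # parse the raw HTML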

However, on some pages (always the same ones, and the order does not seem to matter) I get:

Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
  File "/usr/local/lib/python2.7/dist-packages/BeautifulSoup.py", line 439, in __getnewargs__
    return (NavigableString.__str__(self),)
RuntimeError: maximum recursion depth exceeded while calling a Python object

The pages do exist, so I don't think adding except urllib2.HTTPError: would help.

Best answer

In [1]: import urllib2

In [2]: from BeautifulSoup import BeautifulSoup

In [3]: url = 'http://www.sparklebox.co.uk/topic/creative-arts/art-and-design/colouring-pages.html'

In [4]: request = urllib2.Request(url=url, headers={ 'User-Agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT' })

In [5]: response = urllib2.urlopen(request)

In [6]: HTML_response = response.read()

In [7]: b1 = BeautifulSoup(HTML_response)

In [8]: print type(b1)
<class 'BeautifulSoup.BeautifulSoup'>

It works fine with BeautifulSoup 3.2.
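
The traceback in the question runs through multiprocessing/queues.py, which pickles every object it sends. This suggests that the parsed tree is being passed between processes and that pickling a deeply nested tree is what reaches the recursion limit. That is an assumption, since the question does not show its multiprocessing code. One possible workaround, sketched below, is to send the raw HTML string and parse it on the receiving side. The names fetch_html, roundtrip and fake_read are hypothetical, and the HTTP call is replaced by a stand-in so the sketch runs on its own with only the standard library:

```python
import pickle


def fetch_html(url, read_url):
    """Return the page body as plain text instead of a parsed tree.

    read_url is a hypothetical callable (url -> text); in the question's
    code it would wrap urllib2.urlopen(request).read().
    """
    return read_url(url)


def roundtrip(obj):
    """Pickle and unpickle obj, as a multiprocessing queue does when sending."""
    return pickle.loads(pickle.dumps(obj))


def fake_read(url):
    """Stand-in for the network call: returns deeply nested markup."""
    return "<html><body>" + "<div>" * 5000 + "x" + "</div>" * 5000 + "</body></html>"


html = fetch_html("http://example.invalid/page.html", fake_read)
# A flat string pickles without recursing, however deeply the markup nests.
received = roundtrip(html)
# The receiving process would then call BeautifulSoup(received) itself.
```

If the tree really must cross the process boundary, raising the limit with sys.setrecursionlimit is another option, at the cost of deeper C stack use.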

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/9206278/
