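python - Script throws an error when scraping multiple values

Tags: python python-2.7 web-scraping beautifulsoup mechanize

My script tries to scrape a website for about 80000 IDs that are stored in a text file. When I run the code with a single input it works fine, but when I put all the inputs in a loop, I get an error.

My code: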

from bs4 import BeautifulSoup
import mechanize

br = mechanize.Browser()
response = br.open("https://www.matsugov.us/myproperty")

for form in br.forms():
    if form.attrs.get('name') == 'frmSearch':
        br.form = form
        break

br.form['ddlType']=["taxid"]

with open("names.txt") as ins:
    tx = ins.read().splitlines()

    for x in tx:
        br['txtParm'] = x
        req = br.submit().read()
        soup = BeautifulSoup(req, 'html.parser')
        table = soup.find('td', {'class': 'Grid_5'})

        for row in table:
            print row

Error:

AttributeError: mechanize._mechanize.Browser instance has no attribute __setitem__ (perhaps you forgot to .select_form()?)

Best answer

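You put the loop in the wrong place! Start it right below the line that creates the mechanize browser, so that opening the page and selecting the form happen inside the loop, and wrap the loop body in a try/except. After br.submit() the browser moves on to the results page and no form is selected any more, which is why the next br['txtParm'] assignment fails. It will work. Once you have tried it and it works, please leave a comment.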
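
The restructuring described above could look roughly like the following. This is only a sketch under assumptions, not a tested fix for the site: the helper names select_form_by_name, iter_result_pages and main are made up for illustration, and the URL, form name, field names and the Grid_5 class are copied from the question without checking that the page still uses them. The print calls use parentheses so the code runs on both Python 2.7 and Python 3.

```python
SEARCH_URL = "https://www.matsugov.us/myproperty"  # taken from the question


def select_form_by_name(br, name):
    """Select the form whose name attribute matches (hypothetical helper)."""
    for form in br.forms():
        if form.attrs.get("name") == name:
            br.form = form
            return
    raise LookupError("no form named %r on the page" % name)


def iter_result_pages(br, tax_ids, url=SEARCH_URL):
    """Yield (tax_id, html, error); exactly one of html and error is None."""
    for tax_id in tax_ids:
        try:
            # submit() navigates away and leaves no form selected, so the
            # search page is reloaded and the form reselected for every ID.
            br.open(url)
            select_form_by_name(br, "frmSearch")
            br.form["ddlType"] = ["taxid"]
            br.form["txtParm"] = tax_id
            html = br.submit().read()
        except Exception as exc:  # one bad ID should not abort the whole run
            yield tax_id, None, exc
        else:
            yield tax_id, html, None


def main(id_file="names.txt"):
    # Third-party imports are kept local so the helpers above can also be
    # exercised with a stand-in browser object.
    import mechanize
    from bs4 import BeautifulSoup

    with open(id_file) as ins:
        tax_ids = ins.read().splitlines()

    for tax_id, html, error in iter_result_pages(mechanize.Browser(), tax_ids):
        if error is not None:
            print("%s failed: %s" % (tax_id, error))
            continue
        cell = BeautifulSoup(html, "html.parser").find("td", {"class": "Grid_5"})
        if cell is None:  # find() returns None when nothing matches
            print("%s: no result cell found" % tax_id)
            continue
        for row in cell:
            print(row)
```

Calling main() runs the scrape against names.txt. Because failures are reported per ID instead of stopping the loop, a run over all 80000 IDs can finish and the failed IDs can be retried later.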

Source: this question, "python - Script throws an error when scraping multiple values", corresponds to a similar question found on Stack Overflow: https://stackoverflow.com/questions/47383106/
