python - Mechanize br.submit() limitations?

Tags: python mechanize form-submit

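My goal is to use Mechanize to submit a search query to a website and parse the results with BeautifulSoup. This will only ever be used against the same website, so the form names and so on can be hard-coded. My initial query has a problem, as shown below: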

import mechanize
import urllib2
#from bs4 import BeautifulSoup


def inspect_page(url):
    br = mechanize.Browser(factory=mechanize.RobustFactory())
    br.set_handle_robots(False)
    br.addheaders = [('User-agent',
                      'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6')]
    br.set_handle_redirect(True)

    try:
        br.open(url)
    except mechanize.HTTPError as e:
        # Stop here: there is no page to inspect after a failed request.
        print "HTTP Error", e.code
        return
    except urllib2.URLError as e:
        print "URL Error", e.reason
        return

    for form in br.forms():
        print form

    br.select_form(name="dataform")
    br.form['pcode'] = 'WV14 8EW'
    br.form['premise'] = '66'
    response = br.submit()
    print response.read()

    #soup = BeautifulSoup(response.read())

inspect_page('http://www.fensa.co.uk/asp/certificate.asp')

This did not redirect to the results page, and print response.read() displayed the HTML of the page I had submitted the query from, so I assumed I had made an error in my code. However, when I tested another site (inspect_page('https://publicaccess.glasgow.gov.uk/online-applications/search.do?action=simple')) and changed the form and field names to match those on that site:

`br.select_form(name="searchCriteriaForm")
br.form['searchCriteria.simpleSearchString'] = 'Queen Elizabeth Gardens'
response = br.submit()
print response.read()`    

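I was redirected, as I expected. Is there anything that can prevent the page from being redirected when br.submit() is called? I have already checked that the site is not gzipped.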

Best answer

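The form's action is only changed on the page once JavaScript has validated the form input. Mechanize does not execute JavaScript, so br.submit() posts to the original action and simply returns the search page again. I now submit the fields directly to the results URL instead.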

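`import urllib
import mechanize
# Assumption: certificate_results.asp sits next to certificate.asp on the same site.
url = 'http://www.fensa.co.uk/asp/certificate_results.asp'
params = {'pcode': "WV14 8EW", 'premise': "66"}
data = urllib.urlencode(params)
request = mechanize.Request(url)
response = mechanize.urlopen(request, data=data)`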
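
For readers on Python 3, the same direct POST can be sketched with only the standard library. This is a rough sketch rather than the code I actually ran: the helper names `build_post_request` and `send` are hypothetical, and the results URL is assumed to be `certificate_results.asp` resolved against the search page from the question.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request, urlopen


def build_post_request(page_url, action, fields):
    """Build a POST request for `action`, resolved relative to `page_url`.

    Hypothetical helper: it targets the URL that the page's JavaScript
    would have written into the form's action attribute.
    """
    target = urljoin(page_url, action)
    # urlencode output is plain ASCII, so it can be encoded to bytes safely.
    body = urlencode(fields).encode("ascii")
    # Supplying a body makes urllib send the request as a POST.
    return Request(target, data=body, headers={"User-Agent": "Mozilla/5.0"})


def send(request):
    """Send the request and return the decoded response body."""
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


# Assumption: the results page sits next to the search page.
request = build_post_request(
    "http://www.fensa.co.uk/asp/certificate.asp",
    "certificate_results.asp",
    {"pcode": "WV14 8EW", "premise": "66"},
)
```

Calling `send(request)` performs the actual network request; the returned HTML can then be handed to BeautifulSoup as planned in the question.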

Thanks to @BlackJack for the hint.
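
A quick way to spot this kind of page before submitting is to list each form's static `action` next to any `onsubmit` handler. The sketch below uses Python 3's standard `html.parser`; the class name `FormActionLister` and the sample markup are made up for illustration and are not copied from the real site. An `onsubmit` attribute is only one place a script can hook in (handlers can also be attached from separate script code), so an empty result does not prove that the form is static.

```python
from html.parser import HTMLParser


class FormActionLister(HTMLParser):
    """Collect name, action and onsubmit for each <form> in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "form":
            attributes = dict(attrs)
            self.forms.append({
                "name": attributes.get("name"),
                "action": attributes.get("action"),
                "onsubmit": attributes.get("onsubmit"),
            })


# Made-up markup for illustration; it is not copied from the real site.
SAMPLE_HTML = """
<form name="dataform" action="certificate.asp" method="post"
      onsubmit="return validate(this);">
  <input name="pcode"><input name="premise">
</form>
"""

lister = FormActionLister()
lister.feed(SAMPLE_HTML)
for form in lister.forms:
    scripted = form["onsubmit"] is not None
    print(form["name"], form["action"], "scripted" if scripted else "static")
```

If a form shows up as scripted and its action points back at the page you are already on, posting the fields straight to the real results URL, as above, is worth trying.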

For python - Mechanize br.submit() limitations?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30642462/
