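I am trying to download an Excel file from a website. I used mechanize to fill in the form successfully, and submitting the form should return the file download. But when I download it, I get HTML back instead of the actual contents of the file.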
import mechanize
br = mechanize.Browser()
br.open("http://web.sba.gov/pro-net/search/dsp_dsbs.cfm")
br.select_form('SearchForm')
br["States"] = ["AL","AK"]
br["E8a"] = ["Y"]
br["Report"] = ["S"]
response = br.submit()
fileobj = open("szz.txt","wb")
fileobj.write(response.read())
fileobj.close()
The result looks like this:
<!doctype html>
<html lang="en-US" dir="ltr">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=Edge">
<title>SBA - Dynamic Search</title>
<link href="/gls/dsp_choosefunction.cfm" accesskey="1" rel="Home" title="Home (Return to GLS Choose Function)">
<link rel="stylesheet" type="text/css" media="all" href="/library/css/jquery.mobile/sba.dtv.css?CachedAsOf=2012-06-20T22:15"/><!-- local code -->
<link rel="stylesheet" type="text/css" media="all" href="/library/css/sczz.strict.css?CachedAsOf=2013-09-20T18:55"/>
<script src="/library/javascripts/jquery/jquery.js?CachedAsOf=2012-09-21T15:37"></script><!-- 1.8.2 -->
<script src="/library/javascripts/jquery/jquery.mobile/sba.jqm.js?CachedAsOf=2013-03-28T16:11"></script><!-- local code -->
<noscript>
<link rel="stylesheet" type="text/css" media="all" href="/library/css/sczz.noscript.css?CachedAsOf=2010-10-14T19:23"/>
</noscript>
<script>
var gSlafDevTestProd = "Prod";
var gSlafDevTestProdInd = "2";
var gSlafInlineBlock = "inline-block";
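Best answer
I found a few mistakes in your code. I tried the code below, and opening the resulting file in a browser showed a nicely formatted table, so give it a try: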
import mechanize
br = mechanize.Browser()
br.open("http://web.sba.gov/pro-net/search/dsp_dsbs.cfm")
br.select_form('SearchForm')
br.form["State"] = ["AL","AK"]
br.form["E8a"] = ["Y"]
br.form["Report"] = ["S"]
response = br.submit()
fileobj = open("szz.html","wb")
fileobj.write(response.read())
fileobj.close()
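Basically, you need to set the values through br.form[control_name], and you got the "States" key wrong: the control is named just "State". Now save the file as .html and open it in a browser to see whether it is what you are looking for.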
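
Since the problem came down to a wrong control name ("States" instead of "State"), it can help to list the field names a form actually defines before filling it in. The following is only a rough sketch using the standard library's html.parser, not mechanize. The helper list_form_fields and the SAMPLE_HTML markup are made up for illustration and are not the real SBA page.

```python
from html.parser import HTMLParser


class FormFieldLister(HTMLParser):
    """Collect the names of input/select/textarea fields inside one named form."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.inside_form = False
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # Only collect fields while inside the form we were asked about.
            self.inside_form = attrs.get("name") == self.form_name
        elif self.inside_form and tag in ("input", "select", "textarea"):
            name = attrs.get("name")
            if name and name not in self.fields:
                self.fields.append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self.inside_form = False


def list_form_fields(html, form_name):
    """Hypothetical helper: return the field names of the form called form_name."""
    lister = FormFieldLister(form_name)
    lister.feed(html)
    return lister.fields


# Made-up markup that only mimics the shape of a search form.
SAMPLE_HTML = """
<form name="SearchForm">
  <select name="State" multiple><option>AL</option><option>AK</option></select>
  <input type="checkbox" name="E8a" value="Y">
  <input type="radio" name="Report" value="S">
</form>
<form name="Other"><input name="q"></form>
"""

fields = list_form_fields(SAMPLE_HTML, "SearchForm")
print(fields)
print("States" in fields, "State" in fields)
```

With mechanize itself, iterating over br.form.controls after br.select_form('SearchForm') and printing each control's name should give the same information directly from the live page.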
Source: a similar question about downloading files with Python mechanize on Stack Overflow: https://stackoverflow.com/questions/20412157/