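I have the following spider, which is pretty much just supposed to POST to a form, but I can't seem to get it to work. When I run it through Scrapy, the response never shows up. Could someone tell me where I'm going wrong?
Here is my spider code: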
    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    import scrapy
    from scrapy.http import FormRequest
    from scrapy.shell import inspect_response


    class RajasthanSpider(scrapy.Spider):
        name = "rajasthan"
        allowed_domains = ["rajtax.gov.in"]
        start_urls = (
            'http://www.rajtax.gov.in/',
        )

        def parse(self, response):
            return FormRequest.from_response(
                response,
                formname='rightMenuForm',
                formdata={'dispatch': 'dealerSearch'},
                callback=self.dealer_search_page)

        def dealer_search_page(self, response):
            yield FormRequest.from_response(
                response,
                formname='dealerSearchForm',
                formdata={
                    "zone": "select",
                    "dealertype": "VAT",
                    "dealerSearchBy": "dealername",
                    "name": "ana"
                }, callback=self.process)

        def process(self, response):
            inspect_response(response, self)
When I replace my dealer_search_page() with a version that goes through Splash:
    def dealer_search_page(self, response):
        yield FormRequest.from_response(
            response,
            formname='dealerSearchForm',
            formdata={
                "zone": "select",
                "dealertype": "VAT",
                "dealerSearchBy": "dealername",
                "name": "ana"
            },
            callback=self.process,
            meta={
                'splash': {
                    'endpoint': 'render.html',
                    'args': {'wait': 0.5}
                }
            })
I get the following warning:
2016-03-14 15:01:29 [scrapy] WARNING: Currently only GET requests are supported by SplashMiddleware; <POST http://rajtax.gov.in:80/vatweb/dealerSearch.do> will be handled without Splash
The program exits before it reaches inspect_response() in my process() function.
The warning says that Splash does not support POST yet.
Will Splash work for this use case, or should I be using Selenium?
Best answer
Splash now supports POST requests. Try SplashFormRequest or {'splash':{'http_method':'POST'}}
Regarding python - POSTing to a JavaScript-generated form with Scrapy and Splash, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35968831/