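I want to get data from a website, but the response only shows an error.
I tried changing the URL to http, which resulted in a 405 error, and I tried changing data=json.dumps(data) to data=data, but neither worked.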
import requests
import json
from bs4 import BeautifulSoup
request_url = 'https://www.kinds.or.kr/news/newsResult.do'
data = {"jsonSearchParam": {"indexName": "news", "searchKey": "sky", "searchKeys": [{}], "byLine": "", "searchFilterType": "1", "searchScopeType": "1", "mainTodayPersonYn": "", "startDate": "2019-05-06", "endDate": "2019-08-06", "newsIds": [
], "categoryCodes": [], "incidentCodes": [], "networkNodeType": "", "topicOrigin": ""}, "index-name": "news", "N": "", "search-keyword": "sky", "search-index-type": "news", "dict-type": "texanomy", "dict-concat": "OR"}
response = requests.post(request_url, data=json.dumps(data))
html = response.text
soup = BeautifulSoup(html, 'html.parser')
flist = soup.find_all('span')
print(response)
I expect to get a proper response.
Best answer
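Your URL appears to be incorrect. According to the Firefox developer tools, the correct search URL is 'https://www.kinds.or.kr/v2/news/search.do'. The "jsonSearchParam" parameter needs to be a JSON string, so we apply json.dumps() to that value only and send the rest as ordinary form fields: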
import json
import requests
from bs4 import BeautifulSoup
# request_url = 'https://www.kinds.or.kr/news/newsResult.do'
request_url = 'https://www.kinds.or.kr/v2/news/search.do' # <-- correct URL
d = {"indexName": "news", "searchKey": "sky", "searchKeys": [{}], "byLine": "", "searchFilterType": "1", "searchScopeType": "1", "mainTodayPersonYn": "", "startDate": "2019-05-06", "endDate": "2019-08-06", "newsIds": [], "categoryCodes": [], "incidentCodes": [], "networkNodeType": "", "topicOrigin": ""}
data = {"jsonSearchParam": json.dumps(d), "index-name": "news", "N": "", "search-keyword": "sky", "search-index-type": "news", "dict-type": "texanomy", "dict-concat": "+OR+"}
response = requests.post(request_url, data=data)
print(response)
soup = BeautifulSoup(response.text, 'lxml')
flist = soup.find_all('span')
print(flist)
Prints:
<Response [200]>
[<span class="sr-only">Toggle navigation</span>, <span class="icon-bar"></span>, <span class="icon-bar"></span>, <span class="icon-bar"></span>, <span aria-hidden="true">×</span>, <span class="input-group-addon">
<i class="fal fa-envelope"></i>
</span>, <span class="input-group-addon">
...and so on.
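To illustrate why only the inner parameter is passed through json.dumps(), here is a minimal offline sketch. It assumes the endpoint expects a form-encoded body, which is what requests produces for data=<dict>. The payload is shortened for illustration, and urllib.parse.urlencode stands in for the encoding step, so no request is actually sent.

```python
import json
from urllib.parse import urlencode, parse_qs

# Shortened, illustrative version of the inner search parameters.
search_params = {
    "indexName": "news",
    "searchKey": "sky",
    "startDate": "2019-05-06",
    "endDate": "2019-08-06",
}

# Answer's approach: only the nested value becomes a JSON string,
# the outer dict stays a plain mapping of form fields.
form_fields = {
    "jsonSearchParam": json.dumps(search_params),
    "index-name": "news",
    "search-keyword": "sky",
}
form_body = urlencode(form_fields)  # key=value&key=value...

# Question's approach: the whole payload is one JSON document,
# so the body contains no form fields at all.
json_body = json.dumps({
    "jsonSearchParam": search_params,
    "index-name": "news",
    "search-keyword": "sky",
})

# A server that parses form fields can recover the nested parameters
# from the first body by decoding the field and parsing its JSON.
decoded = parse_qs(form_body)
recovered = json.loads(decoded["jsonSearchParam"][0])

print(form_body)
print(json_body)
print(recovered == search_params)
```

The first body carries "jsonSearchParam" as one form field whose value is JSON text, while the second is a single JSON document that a form parser would not split into fields.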
Regarding "python - How to fix <Response [404]> error when sending a POST request", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57372442/