python - Unable to download a file after logging in with Python requests

Tags: python python-requests

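I am trying to download a file with the Python requests module, but I have to log in to the site first. The login works, but when I send the GET request to download the file, I get the login page again.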

Code:

import requests

login_url = 'https://seller.flipkart.com/login'
manifest_url = 'https://seller.flipkart.com/order_management/manifest.pdf'

username = 'username@gmail.com'
password = 'password'

params = {'sellerId':'seller_id'}
payload = {'authName':'flipkart',
           'username':username,
           'password':password}

ses = requests.Session()
ses.post(login_url, data=payload, headers={'Content-Type':'application/x-www-form-urlencoded','Connection':'keep-alive'})
response = ses.get(manifest_url, params=params, headers={'Content-Type':'application/pdf','Connection':'keep-alive'})

print(response.status_code)
print(response.url)
print(response.content)

When I run this code, I get the HTML of the login page as the content. I used Fiddler and got the following data:

Request URL: https://seller.flipkart.com/order_management/manifest.pdf?sellerId=seller_id
Request Method: GET
sellerId: seller_id

# Request Headers

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Referer: https://seller.flipkart.com/order_management?sellerId=seller_id
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8 

# Response Headers

Server: nginx
Date: Wed, 30 Dec 2015 13:12:31 GMT
Content-Type: application/pdf
Content-Length: 3652
Connection: keep-alive
X-XSS-Protection: 1; mode=block
strict-transport-security: max-age=31536000; preload
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
Cache-Control: private, no-cache, no-store, must-revalidate
Expires: -1
Pragma: no-cache
X-Req-Id: REQ-14d7434a-e429-40e4-801f-6010d7c0b48c
X-Host-Id: 0008
content-disposition: attachment; filename=Manifest-seller_id-30-Dec-2015-18-42-30.pdf
vary: Accept-Encoding
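
The captured trace above shows what a successful download looks like: `Content-Type: application/pdf` plus a `content-disposition` header. A minimal sketch for checking whether a script's response matches that, or is the login page again, might look like the following. `looks_like_pdf` is a hypothetical helper introduced here for illustration, not part of requests.

```python
def looks_like_pdf(headers, body):
    """Return True if both the headers and the body look like a PDF."""
    content_type = headers.get("Content-Type", "").lower()
    # A PDF file starts with the magic bytes "%PDF-".
    return content_type.startswith("application/pdf") and body.startswith(b"%PDF-")


# Header taken from the captured response above; the bodies are made-up samples.
print(looks_like_pdf({"Content-Type": "application/pdf"}, b"%PDF-1.4 ..."))
print(looks_like_pdf({"Content-Type": "text/html; charset=utf-8"}, b"<html>"))
```

With the session from the question, this would presumably be called as `looks_like_pdf(response.headers, response.content)`. If it returns False, `response.history` lists any redirects that were followed, which can show whether the server sent the request back to the login page.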

How can I download the file?

Best answer

Set stream=True, then write the content to a file.

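import re

# Send the request with stream=True ('...' stands for the other arguments from the question)
r = ses.get(manifest_url, ..., stream=True)

# Fetch the filename; re.findall returns a list, so take the first match
d = r.headers['content-disposition']
fname = re.findall("filename=(.+)", d)[0]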

# Write content to file
with open(fname, 'wb') as f:
    for chunk in r.iter_content(chunk_size=1024): 
        if chunk: # filter out keep-alive new chunks
            f.write(chunk)

See the requests docs.
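
The two steps in the answer (reading the filename from `content-disposition`, then writing the streamed body in chunks) can be sketched as small reusable functions. This is a sketch under assumptions, not the answerer's code: `filename_from_content_disposition`, `save_stream`, and the `download.bin` fallback are hypothetical names introduced here, and the regular expression only handles the plain `filename=` form, not the extended `filename*=` form.

```python
import re


def filename_from_content_disposition(header, default="download.bin"):
    """Extract the filename from a Content-Disposition header value."""
    match = re.search(r'filename="?([^";]+)"?', header or "")
    return match.group(1).strip() if match else default


def save_stream(response, path, chunk_size=1024):
    """Write a streamed requests response to `path`, chunk by chunk."""
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
    return path


# Header value as it appears in the captured response in the question.
header = "attachment; filename=Manifest-seller_id-30-Dec-2015-18-42-30.pdf"
print(filename_from_content_disposition(header))
```

With a response `r` obtained with `stream=True`, the call would be `save_stream(r, filename_from_content_disposition(r.headers.get('content-disposition')))`. Using `.get` means a missing header falls back to the default name rather than raising a `KeyError`.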

Regarding "python - Unable to download a file after logging in with Python requests", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34542123/
