I am trying to download the images for every case listed in the CaseIDs array, but it is not working. I want the code to run for all of the cases.
from bs4 import BeautifulSoup
import requests as rq
from urllib.parse import urljoin
from tqdm import tqdm

CaseIDs = [100237, 99817, 100271]

with rq.session() as s:
    for caseid in tqdm(CaseIDs):
        url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")
        url = urljoin(url, soup.find('a', text='Text and Images Only')['href'])
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")
        links = [urljoin(url, i['src']) for i in soup.select('img[src^="GetBinary.aspx"]')]
        count = 0
        for link in links:
            content = s.get(link).content
            with open("test_image" + str(count) + ".jpg", 'wb') as f:
                f.write(content)
            count += 1
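Best answer
The URL in the question is a plain string literal, so the text {caseid} (and the space before it) is sent to the server as-is instead of the actual case ID. Try using format() like this: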
url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID={}'.format(caseid)
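As a minimal sketch of how this fits into the loop, the snippet below builds the URL for each case ID without doing any network requests. The helper names build_case_url and image_filename are hypothetical and not part of the original code. The file naming scheme is also an assumption: if the counter in the question restarts at 0 for every case, the fixed name "test_image0.jpg" would be overwritten by each new case, so including the case ID in the name is one way to avoid that.

```python
# URL template taken from the question; the placeholder {} has no leading space.
BASE_URL = "https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID={}"


def build_case_url(caseid):
    """Return the case page URL with the case ID substituted in."""
    return BASE_URL.format(caseid)


def image_filename(caseid, index):
    """Hypothetical naming scheme: prefix the case ID so that images
    from different cases do not overwrite each other."""
    return "case{}_image{}.jpg".format(caseid, index)


CaseIDs = [100237, 99817, 100271]
for caseid in CaseIDs:
    # Print what would be requested and saved for the first image of each case.
    print(build_case_url(caseid), image_filename(caseid, 0))
```

In the original loop, build_case_url(caseid) would replace the hard-coded url string, and image_filename(caseid, count) would replace the "test_image" + str(count) + ".jpg" expression. An f-string (an f prefix on the literal, with the space removed) should work equally well on Python 3.6 or later.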
Regarding "python - modifying a URL parameter to download images from multiple pages", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59819738/