I am trying to scrape the following page.
My goal is to read every job title on this page - https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python
What I have tried:
import requests
from bs4 import BeautifulSoup

URL = 'https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'html.parser')
results = soup.find(id='ResultsContainer')

# Look for Python jobs
python_jobs = results.find_all("div", string=lambda t: "python" in t.lower())
for p_job in python_jobs:
    link = p_job.find("h3")["href"]
    print(p_job.text.strip())
    print(f"Apply here: {link}\n")
But I get the following error:
AttributeError: 'NoneType' object has no attribute 'find_all'
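How can I read all the titles?
Best answer
The problem is that there is no tag with id="ResultsContainer" on the page, so soup.find() returns None and calling find_all on it raises the AttributeError. Instead, you can search for all <h3> tags whose text contains "python", then find each one's parent <a> tag to get the URL: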
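import requests
from bs4 import BeautifulSoup

URL = 'https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'html.parser')

# string= is the current name of the older text= argument.
# t can be None when an <h3> contains nested tags, so check it before calling lower().
results = soup.find_all('h3', string=lambda t: t is not None and 'python' in t.lower())
for r in results:
    print(r.text)
    print(r.find_parent('a')['href'])  # the <a> wrapping the title holds the job URL
    print('-' * 80)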
Prints:
Senior Python Developer
https://www.cvbankas.lt/senior-python-developer-vilniuje/1-6719819
--------------------------------------------------------------------------------
Full Stack Engineer (React + Python)
https://www.cvbankas.lt/full-stack-engineer-react-python-vilniuje/1-6665723
--------------------------------------------------------------------------------
Python programuotojas (Mid-Senior)
https://www.cvbankas.lt/python-programuotojas-mid-senior-vilniuje/1-6693547
--------------------------------------------------------------------------------
Python Developer
https://www.cvbankas.lt/python-developer-vilniuje/1-6604883
--------------------------------------------------------------------------------
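The failure and the fix can also be reproduced without touching the network. The following is only a minimal sketch: the HTML in SAMPLE_HTML, its example.com links, and the helper name find_jobs are invented for illustration and are not the real cvbankas.lt markup. It shows that find() returns None for a missing id, and it guards the string filter because Tag.string is None for a tag with more than one child.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (not the actual site HTML).
SAMPLE_HTML = """
<div id="job_list">
  <a href="https://example.com/jobs/1"><h3>Senior Python Developer</h3></a>
  <a href="https://example.com/jobs/2"><h3>Java Developer</h3></a>
  <a href="https://example.com/jobs/3"><h3><span>Data</span> Engineer</h3></a>
</div>
"""


def find_jobs(html, keyword):
    """Return (title, url) pairs for <h3> titles containing keyword."""
    soup = BeautifulSoup(html, "html.parser")
    # Tag.string is None for the third <h3> (it has two children),
    # so the filter checks for None before calling lower().
    titles = soup.find_all(
        "h3", string=lambda s: s is not None and keyword in s.lower()
    )
    jobs = []
    for h3 in titles:
        link = h3.find_parent("a")
        # Skip titles that are not wrapped in a link with an href.
        if link is not None and link.has_attr("href"):
            jobs.append((h3.get_text(strip=True), link["href"]))
    return jobs


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# No tag has this id, so find() returns None instead of raising.
container = soup.find(id="ResultsContainer")
missing = container is None

jobs = find_jobs(SAMPLE_HTML, "python")
for title, url in jobs:
    print(title)
    print(url)
```

Checking the result of find() against None before chaining another call is what avoids the AttributeError from the question; on the sample markup only the first title matches.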
Regarding python - "NoneType object has no attribute find_all" error when using Beautiful Soup, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62995161/