How to remove tags from BeautifulSoup results (e.g. Address = [a, b, c, d, r......])
from bs4 import BeautifulSoup as bs
import requests
#
url = 'https://www.planetware.com/tourist-attractions-/oslo-n-osl-oslo.htm'
url_get = requests.get(url)
soup = bs(url_get.content, 'html.parser')
#
address=soup.find_all('p', class_="nospc")
address
[<p class="nospc">Address: Nobels gate 32, N-0268 Oslo</p>,
<p class="nospc">Address: Akershus Festning, 0015 Oslo</p>,
<p class="nospc">Address: Frederiks gate 2, 0164 Oslo</p>,
<p class="nospc">Address: Universitetsgata 13, Oslo</p>,
<p class="nospc">Address: Tøyengata 53, 0578 Oslo</p>,
<p class="nospc">Address: Bellevue, Oslo</p>,
<p class="nospc">Address: Frederiks gate 2, 0164 Oslo</p>,
<p class="nospc">Address: Bygdøynesveien 39, 0286 Oslo</p>,
<p class="nospc">Address: Kongeveien 5, 0787 Oslo</p>,
<p class="nospc">Address: Karl Johansgt. 11, 0154 Oslo</p>,
<p class="nospc">Address: Rådhuset, 0037 Oslo</p>,
<p class="nospc">Address: Bryggegata 9, 0120 Oslo</p>,
<p class="nospc">Address: Sars gate 1, 0562 Oslo</p>,
<p class="nospc">Address: Kirsten Flagstads Plass 1, 0150 Oslo</p>]
I want something like this:
Address = ['Nobels gate 32, N-0268 Oslo', 'Akershus Festning, 0015 Oslo' ...]
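Best answer
Try the following code. It takes the text of each paragraph, splits it on the colon, and keeps the address part.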
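from bs4 import BeautifulSoup
import requests

url = 'https://www.planetware.com/tourist-attractions-/oslo-n-osl-oslo.htm'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')

# Each address sits in a <p class="nospc"> element
address = soup.find_all('p', class_="nospc")
addrlist = []
for addr in address:
    # .text drops the <p> tags; the part after "Address:" is the address itself
    addrlist.append(addr.text.split(':')[1].strip())
print(addrlist)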
Output:
['Nobels gate 32, N-0268 Oslo', 'Akershus Festning, 0015 Oslo', 'Frederiks gate 2, 0164 Oslo', 'Universitetsgata 13, Oslo', 'Tøyengata 53, 0578 Oslo', 'Bellevue, Oslo', 'Frederiks gate 2, 0164 Oslo', 'Bygdøynesveien 39, 0286 Oslo', 'Kongeveien 5, 0787 Oslo', 'Karl Johansgt. 11, 0154 Oslo', 'Rådhuset, 0037 Oslo', 'Bryggegata 9, 0120 Oslo', 'Sars gate 1, 0562 Oslo', 'Kirsten Flagstads Plass 1, 0150 Oslo']
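Indexing the result of split(':') with [1] assumes every matched paragraph contains exactly one colon: a paragraph without a colon would raise an IndexError, and an address that itself contains a colon would be cut short. A minimal sketch of a more defensive variant follows. The helper name strip_label and the sample strings are hypothetical and only stand in for the values that addr.text returns; the third sample is invented to show a paragraph that does not start with the label.

```python
def strip_label(text, label="Address:"):
    """Return text without a leading label; leave unlabeled text unchanged."""
    text = text.strip()
    if text.startswith(label):
        # Drop only the leading label, so later colons in the text survive
        return text[len(label):].strip()
    return text


# Hypothetical sample strings standing in for the values of addr.text
samples = [
    "Address: Nobels gate 32, N-0268 Oslo",
    "Address: Bellevue, Oslo",
    "Opening hours: 10:00-16:00",
]
cleaned = [strip_label(s) for s in samples]
print(cleaned)
```

In the answer's loop, this would be used as addrlist.append(strip_label(addr.text)), assuming the label on the page is always spelled "Address:".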
Regarding python - How to remove tags from my BeautifulSoup results (e.g. Address = [a, b, c, d, r......]), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56456777/