python - How do I remove the tags from my BeautifulSoup results (e.g. Address = [a, b, c, d, r......])

Tags: python dataframe beautifulsoup

How can I remove the tags from a BeautifulSoup result, so that I end up with a plain list (e.g. Address = [a, b, c, d, r......])?

from bs4 import BeautifulSoup as bs
import requests

# Download the page and parse it
url = 'https://www.planetware.com/tourist-attractions-/oslo-n-osl-oslo.htm'
url_get = requests.get(url)
soup = bs(url_get.content, 'html.parser')

# Collect every <p class="nospc"> element (these hold the addresses)
address = soup.find_all('p', class_="nospc")
address
[<p class="nospc">Address: Nobels gate 32, N-0268 Oslo</p>,
<p class="nospc">Address: Akershus Festning, 0015 Oslo</p>,
<p class="nospc">Address: Frederiks gate 2, 0164 Oslo</p>,
<p class="nospc">Address: Universitetsgata 13, Oslo</p>,
<p class="nospc">Address: Tøyengata 53, 0578 Oslo</p>,
<p class="nospc">Address: Bellevue, Oslo</p>,
<p class="nospc">Address: Frederiks gate 2, 0164 Oslo</p>,
<p class="nospc">Address: Bygdøynesveien 39, 0286 Oslo</p>,
<p class="nospc">Address: Kongeveien 5, 0787 Oslo</p>,
<p class="nospc">Address: Karl Johansgt. 11, 0154 Oslo</p>,
<p class="nospc">Address: Rådhuset, 0037 Oslo</p>,
<p class="nospc">Address: Bryggegata 9, 0120 Oslo</p>,
<p class="nospc">Address: Sars gate 1, 0562 Oslo</p>,
<p class="nospc">Address: Kirsten Flagstads Plass 1, 0150 Oslo</p>]

I want something like this:

Address = ['Nobels gate 32, N-0268 Oslo', 'Akershus Festning, 0015 Oslo' ...]

Best answer

Try the following code. It takes the text of each element, splits it on the colon, and keeps the address part.

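import requests
from bs4 import BeautifulSoup

url = 'https://www.planetware.com/tourist-attractions-/oslo-n-osl-oslo.htm'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')

# Each match is a Tag; .text returns its contents without the markup
address = soup.find_all('p', class_="nospc")
addrlist = []
for addr in address:
    # Split on the first colon only, then keep what follows "Address:"
    addrlist.append(addr.text.split(':', 1)[1].strip())

print(addrlist)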

Output:

['Nobels gate 32, N-0268 Oslo', 'Akershus Festning, 0015 Oslo', 'Frederiks gate 2, 0164 Oslo', 'Universitetsgata 13, Oslo', 'Tøyengata 53, 0578 Oslo', 'Bellevue, Oslo', 'Frederiks gate 2, 0164 Oslo', 'Bygdøynesveien 39, 0286 Oslo', 'Kongeveien 5, 0787 Oslo', 'Karl Johansgt. 11, 0154 Oslo', 'Rådhuset, 0037 Oslo', 'Bryggegata 9, 0120 Oslo', 'Sars gate 1, 0562 Oslo', 'Kirsten Flagstads Plass 1, 0150 Oslo']
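For trying the same idea without hitting the network, here is a minimal sketch that parses a short inline sample copied from the question's output. The names `SAMPLE_HTML`, `LABEL`, and `extract_addresses` are hypothetical and only for illustration. It also assumes that every entry is labelled with `Address:` and leaves any paragraph without that label unchanged, rather than raising an error.

```python
from bs4 import BeautifulSoup

# Inline sample copied from the question's output, so no network access is needed.
SAMPLE_HTML = """
<p class="nospc">Address: Nobels gate 32, N-0268 Oslo</p>
<p class="nospc">Address: Akershus Festning, 0015 Oslo</p>
<p class="nospc">Address: Bellevue, Oslo</p>
"""

LABEL = "Address:"  # assumed label prefix, taken from the sample output


def extract_addresses(html):
    """Return the text of every <p class="nospc">, minus the leading label."""
    soup = BeautifulSoup(html, "html.parser")
    addresses = []
    for p in soup.find_all("p", class_="nospc"):
        # get_text() drops the markup and returns only the text content
        text = p.get_text(strip=True)
        if text.startswith(LABEL):
            text = text[len(LABEL):].strip()
        addresses.append(text)
    return addresses


addresses = extract_addresses(SAMPLE_HTML)
print(addresses)
```

Checking for the label with `startswith` instead of indexing the result of `split` means a paragraph with no colon does not raise an `IndexError`, and a colon inside the address itself is preserved.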

Regarding "python - How do I remove the tags from my BeautifulSoup results (e.g. Address = [a, b, c, d, r......])", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56456777/
