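I have the following code that tries to scrape the main table on this page. I need to get the NORAD ID and the launch date from the 2nd and 4th columns. However, I can't get BeautifulSoup to find the table by its ID.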
import requests
from bs4 import BeautifulSoup
data = []
URL = 'https://www.n2yo.com/satellites/?c=52&srt=2&dir=1'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find("table", id="categoriestab")
rows = table.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])  # Get rid of empty values
print(data)
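One way to narrow this down is to check whether `find` returned `None` and, if so, list the table ids the parser actually saw. The sketch below is only an illustration: `SAMPLE_HTML` is a made-up stand-in for the downloaded page (the real markup may differ), and `extract_columns` is a hypothetical helper, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for page.content; the rows are placeholder values.
SAMPLE_HTML = """
<table id="other"><tr><td>x</td></tr></table>
<table id="categoriestab">
  <tr><th>Name</th><th>NORAD ID</th><th>Int'l Code</th><th>Launch date</th></tr>
  <tr><td>SAT-A</td><td>00001</td><td>2000-001A</td><td>2000-01-01</td></tr>
  <tr><td>SAT-B</td><td>00002</td><td>2000-002A</td><td>2000-02-02</td></tr>
</table>
"""


def extract_columns(html, table_id, indexes):
    """Return the cells at the given 0-based column indexes for each data row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        # Report what the parser did find instead of failing on table.find_all.
        found = [t.get("id") for t in soup.find_all("table")]
        raise ValueError(f"no table with id={table_id!r}; table ids seen: {found}")
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Header rows use <th>, so they yield no <td> cells and are skipped.
        if len(cells) > max(indexes):
            rows.append([cells[i] for i in indexes])
    return rows


# Columns 2 and 4 of the table are indexes 1 and 3.
rows = extract_columns(SAMPLE_HTML, "categoriestab", (1, 3))
print(rows)
```

If the error message shows that the expected id is missing from the parsed document, the problem lies in the HTML the parser received rather than in the row loop.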
Best answer
To get the NORAD ID and Launch date, you can try:
import pandas as pd
url = "https://www.n2yo.com/satellites/?c=52&srt=2&dir=0"
df = pd.read_html(url)
data = df[2].drop(["Name", "Int'l Code", "Period[minutes]", "Action"], axis=1)
print(data)
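A possible variation, in case the table's position on the page or its extra columns change: pick the wanted columns by name instead of hard-coding index 2 and dropping the rest. This is only a sketch. `pick_table` is a hypothetical helper, and the `tables` list below is a made-up stand-in for what `pd.read_html(url)` returns, with placeholder values.

```python
import pandas as pd

WANTED = ["NORAD ID", "Launch date"]


def pick_table(tables, wanted=WANTED):
    """Return the wanted columns from the first table that has all of them."""
    for table in tables:
        if set(wanted).issubset(table.columns):
            return table[wanted]
    raise ValueError(f"no table contains columns {wanted}")


# Hypothetical stand-in for pd.read_html(url); the values are placeholders.
tables = [
    pd.DataFrame({"Menu": ["Home"]}),
    pd.DataFrame({
        "Name": ["SAT-A", "SAT-B"],
        "NORAD ID": [1, 2],
        "Int'l Code": ["2000-001A", "2000-002A"],
        "Launch date": ["2000-01-01", "2000-02-02"],
        "Period[minutes]": [95.0, 96.0],
        "Action": ["", ""],
    }),
]

data = pick_table(tables)
print(data)
```

With the real page you would pass the list returned by `pd.read_html(url)` in place of the sample `tables`.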
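Printing `data` after the `drop` call shows the table with only the remaining columns, which should include NORAD ID and Launch date.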
Regarding "python - table not scraped correctly with Python BeautifulSoup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62482020/