python - How do I get data from a specific cell of an HTML table in Python?

Tags: python html parsing beautifulsoup

This link contains the table I'm trying to parse. I am trying to use BeautifulSoup in Python. I am very new to both BeautifulSoup and HTML. Here is my attempt at solving the problem.

from bs4 import BeautifulSoup

# Parse the local copy of the page; naming the parser avoids a bs4 warning
with open('BBS_student_grads.php') as f:
    soup = BeautifulSoup(f, 'html.parser')

data = []
table = soup.find('table')
rows = table.find_all('tr')        # all rows in the table

for row in rows[1:]:               # skips first row
    cols = row.find_all('td')      # all cells in this row
    data.append(cols)              # one list per row, so data is a 2D list

print(data[0][0])                  # prints top left corner

Sample Output

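I am trying to extract all the names in the table, update the names in a list, and then update the table. I am also working from a local copy of this HTML. That is a temporary fix until I learn how to do more web programming.

Thank you very much for your help.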

Best answer

I think you only need the td elements inside the tr elements that have class="searchbox_black".

You can use CSS selectors to get the td elements you want:

for cell in soup.select('tr.searchbox_black td'):
    print(cell.text)

It prints:

BB Salsa

 Adams State University Alamosa, CO               
              Sensei: Oneyda Maestas               
              Raymond Breitstein               

...
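
The output above keeps the padding whitespace from the page, and the question also asks about updating the names afterwards. The following is a minimal sketch of both steps, not the answerer's code. It assumes a small made-up table (`SAMPLE_HTML` is hypothetical; only the `searchbox_black` class comes from the answer) and a placeholder replacement name. It uses `get_text(strip=True)` to trim each cell and assigns to `.string` to change a cell before serializing the tree again.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the class name
# "searchbox_black" is taken from the answer above.
SAMPLE_HTML = """
<table>
  <tr><th>School</th><th>Name</th></tr>
  <tr class="searchbox_black">
    <td> Adams State University Alamosa, CO </td>
    <td> Raymond Breitstein </td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Select the cells as in the answer, then strip the padding whitespace.
cells = soup.select("tr.searchbox_black td")
texts = [cell.get_text(" ", strip=True) for cell in cells]
print(texts)

# Replace the content of one cell in the parsed tree (placeholder value),
# then serialize the modified tree back to an HTML string.
cells[1].string = "New Name"
updated_html = str(soup)
```

Writing `updated_html` back to a file would then update the local copy. Which cell index holds the name depends on the real table layout, so the index `1` here is only an assumption tied to the sample.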

This question, "python - How do I get data from a specific cell of an HTML table in Python?", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/28911361/
