python - Scraping a web page with Python

Tags: python selenium web-scraping beautifulsoup

I'm using selenium to log in and view my school grades. After that I'd like to pull the grades off the site, but I don't know how.

Here is my login code:

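from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Raw string so the backslashes in the Windows path are never read as escapes
driver = webdriver.PhantomJS(r"C:\Python27\phantomjs-1.9.0-windows\phantomjs.exe")
driver.get("https://ps.rsd.edu/public/")

# Fill in the login form and press Enter to submit it
elem = driver.find_element_by_name("account")
elem.send_keys("Username")
elem2 = driver.find_element_by_name("pw")
elem2.send_keys("Password")
elem.send_keys(Keys.RETURN)

# Any scraping has to happen before this line, while the session is still open
driver.quit()

print("done")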

I think the easiest way would be to use Beautifulsoup, but I'm not sure.

Best answer

I'll answer this one here, since the other question is about how to parse the table with Beautifulsoup.

So, given the table at http://gist.github.com/C-Dubb/5522909:

# Select the links inside the grade table whose href ends with "fg=S2"
for cell in driver.find_elements_by_css_selector(".grid tr a[href$='fg=S2']"):
    print(cell.text)
    # if you want the number only, you need to strip the grades here
    # also need to check if S2 cell is empty or not
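
The two comments in that loop (keep only the number, skip empty S2 cells) could be handled along the following lines. This is only a sketch under assumptions: `parse_grade` and `collect_s2_grades` are hypothetical helper names, and the cell text is assumed to look like an optional letter grade followed by a number (for example "A 95"), which may not match the real table. The function has to be called after logging in and before `driver.quit()`. It uses the same `find_elements_by_css_selector` call as the loop above; on Selenium 4 the equivalent is `driver.find_elements(By.CSS_SELECTOR, ...)`.

```python
import re

# Assumption: a grade cell holds an optional letter grade plus a number,
# e.g. "A 95" or "B+ 88.5". Adjust the pattern if the real text differs.
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def parse_grade(cell_text):
    """Return the first number found in cell_text as a float, or None."""
    match = NUMBER_PATTERN.search(cell_text or "")
    return float(match.group()) if match else None


def collect_s2_grades(driver):
    """Collect the numeric S2 grades from the page currently loaded in driver."""
    cells = driver.find_elements_by_css_selector(".grid tr a[href$='fg=S2']")
    grades = []
    for cell in cells:
        grade = parse_grade(cell.text)
        if grade is not None:  # skip S2 cells that are empty or have no number
            grades.append(grade)
    return grades
```

Returning `None` for a cell with no number keeps an empty grade distinct from a genuine score of 0.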

A similar question about scraping a web page with Python can be found on Stack Overflow: https://stackoverflow.com/questions/16390262/
