javascript - Get URL from the onclick attribute of a tag in Python

Tags: javascript python html selenium web-scraping

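I am trying to use Selenium with Python to get the URL that appears in the onclick attribute of a tag. The URL is an argument to a JavaScript function. I have tried several techniques but have not found a solution. I tried triggering the click through the execute_script method. I also tried get_attribute to read the onclick value, but it returned nothing. I want to get the URL that is passed to the openPopUpFullScreen function.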

Here is the HTML:

<td class="tdAction">
<div class="formResponseBtn icon-only">
<a href="#fh" onclick="javascript: openPopUpFullScreen('/esop/toolkit/negotiation/rfq/publicRfqSummaryReport.do?rfqId=rfq_229969', '');" class="openNewPage" title="Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions">
<img src="/esop_custom/images/buttons/print_button.png" title="Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions" alt="Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions"><img src="/esop_custom/images/buttons/openNewWindow_button.png" title="(Opens in new window)" alt="(Opens in new window)">
</a>
</div>
</td>

Here is the Python code:

url=browser.find_element_by_xpath("//img[@title='Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions']").click()
print(browser.current_url)
# It prints the URL of the page I was already on.

And another attempt:

id=browser.find_element_by_css_selector(".openNewPage").get_attribute("onclick")
print(id)
# It prints None.

I need the URL that is passed to the openPopUpFullScreen function, but I cannot figure out the right way to get it.

Update: I also tried using BeautifulSoup to extract the onclick attribute, but it does not seem to be present:

Here is my code:

content = browser.page_source.encode('utf-8').strip()
soup = BeautifulSoup(content,"html.parser")
res = soup.find("a",{"class":"openNewPage"})
print(res)
# It returns the complete tag, but without the onclick attribute.
# I also tried this:
res = soup.find("a",{"class":"openNewPage"})[onclick]
# It raises NameError: name 'onclick' is not defined

Best answer

See the code below:

from bs4 import BeautifulSoup


html = '''<td class="tdAction">
<div class="formResponseBtn icon-only">
<a href="#fh" onclick="javascript: openPopUpFullScreen('/esop/toolkit/negotiation/rfq/publicRfqSummaryReport.do?rfqId=rfq_229969', '');" class="openNewPage" title="Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions">
<img src="/esop_custom/images/buttons/print_button.png" title="Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions" alt="Open a new window to view > View or download a Summary of this PQQ/ITT which includes details of the PQQ/ITT settings, format and questions"><img src="/esop_custom/images/buttons/openNewWindow_button.png" title="(Opens in new window)" alt="(Opens in new window)">
</a>
</div>
</td>'''


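# Parse the snippet (the lxml parser must be installed) and take the first <a> tag.
soup = BeautifulSoup(html, features="lxml")
a = soup.find('a')
# The key must be a quoted string: a.attrs['onclick'], not a.attrs[onclick].
onclick = a.attrs['onclick']
# The URL is the text between the first pair of single quotes.
left = onclick.find("'")
right = onclick.find("'", left + 1)
print('URL is: {}'.format(onclick[left + 1:right]))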

Output

URL is: /esop/toolkit/negotiation/rfq/publicRfqSummaryReport.do?rfqId=rfq_229969
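
The quote-searching approach above works for this exact markup. As a slightly more defensive variation, here is a minimal sketch that matches the first argument of openPopUpFullScreen with a regular expression and can optionally resolve it against the page URL. The helper names `extract_popup_url` and `popup_url_from_element`, and the base URL `https://example.com/esop/list.do`, are made up for illustration and are not part of the original question or answer. `popup_url_from_element` only assumes an object with a `get_attribute(name)` method, which is what a Selenium WebElement provides.

```python
import re
from urllib.parse import urljoin

# Matches the first single- or double-quoted argument of openPopUpFullScreen(...).
_POPUP_ARG = re.compile(r"openPopUpFullScreen\(\s*(['\"])(.*?)\1")


def extract_popup_url(onclick, base_url=None):
    """Return the first argument of openPopUpFullScreen, or None if absent."""
    if not onclick:
        return None
    match = _POPUP_ARG.search(onclick)
    if match is None:
        return None
    path = match.group(2)
    # Resolve the site-relative path only when a base URL is supplied.
    return urljoin(base_url, path) if base_url else path


def popup_url_from_element(element, base_url=None):
    """Read onclick from any object exposing get_attribute(name)."""
    return extract_popup_url(element.get_attribute("onclick"), base_url)


onclick = ("javascript: openPopUpFullScreen("
           "'/esop/toolkit/negotiation/rfq/publicRfqSummaryReport.do"
           "?rfqId=rfq_229969', '');")
print(extract_popup_url(onclick))
# Hypothetical base URL, used only to show how the relative path is resolved.
print(extract_popup_url(onclick, "https://example.com/esop/list.do"))
```

Both helpers return None instead of raising when the attribute is missing or does not contain the function call, so the caller can check for that case explicitly. With BeautifulSoup, the same string can be read with `res.get("onclick")`; the NameError in the question comes from writing `[onclick]` without quotes, which Python treats as a variable name.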

Regarding "javascript - Get URL from the onclick attribute of a tag in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58443772/
