javascript - Cannot find the forms in the HTML code - Web scraping with Python and Selenium

Tags: javascript python forms selenium web-scraping

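I'm trying to scrape this website, but it has some forms that need to be filled in.

The main goal is to fill in these 5 forms (they appear after one of them is selected) and download the data through the "Consult" button.

The forms are built with JavaScript, and I can't find them in the page's HTML source. When I inspect the frame in Google Chrome I can see the form IDs, but my code cannot find them.

All I have is a prototype of my code. Without knowing how to locate these forms, I can't move forward.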

from selenium import webdriver
from bs4 import BeautifulSoup
import time
import os

#Variables

url = 'http://www.anbima.com.br/pt_br/informar/sistema-reune.htm'
path_phantom = 'C:\\Users\\TBMEPYG\\AppData\\Local\\Continuum\\Anaconda3\\Lib\\site-packages\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe'

#Processing

driver = webdriver.PhantomJS(executable_path= path_phantom)

driver.get(url)

data = driver.find_element_by_id('data_ref')

data.send_keys("21/08/2017")

driver.quit()

Edit:

I updated the code to:

    from selenium import webdriver

    path_phantom = 'C:\\Users\\TBMEPYG\\AppData\\Local\\Continuum\\Anaconda3\\Lib\\site-packages\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe'
    driver = webdriver.PhantomJS(executable_path= path_phantom)
    driver.get('http://www.anbima.com.br/reune/reune.asp')


    driver.switch_to.frame(driver.find_element_by_xpath('//iframe[@class="full"]'))
    data = driver.find_element_by_name('Dt_Ref')
    data.clear()
    data.send_keys('21/08/2017')

and I got this error:

CD: C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3
Current directory: C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3
python "C:\Users\TBMEPYG\Desktop\vamo.py"
Process started >>>
Traceback (most recent call last):
  File "C:\Users\TBMEPYG\Desktop\vamo.py", line 8, in <module>
    data = driver.find_element_by_name('Dt_Ref')
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 426, in find_element_by_name
    return self.find_element(by=By.NAME, value=name)
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 832, in find_element
    'value': value})['value']
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 297, in execute
    self.error_handler.check_response(response)
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: {"errorMessage":"Unable to find element with name 'Dt_Ref'","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"89","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:62040","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"name\", \"value\": \"Dt_Ref\", \"sessionId\": \"bdd3fc70-8dd0-11e7-aeb1-85b8cfbe0d1c\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/bdd3fc70-8dd0-11e7-aeb1-85b8cfbe0d1c/element"}}
Screenshot: available via screen

Edit 2:

Another possibility is to use the link embedded in the main page: http://www.anbima.com.br/reune/reune.asp

When I changed the code to the following, I got a different error:

from selenium import webdriver

path_phantom = 'C:\\Users\\TBMEPYG\\AppData\\Local\\Continuum\\Anaconda3\\Lib\\site-packages\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe'
driver = webdriver.PhantomJS(executable_path= path_phantom)
driver.get('http://www.anbima.com.br/reune/reune.asp')


data = driver.find_element_by_name('Dt_Ref')
data.clear()
data.send_keys('21/08/2017')

Error:

Traceback (most recent call last):
  File "C:\Users\TBMEPYG\Desktop\vamo.py", line 9, in <module>
    data = driver.find_element_by_name('Dt_Ref')
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 426, in find_element_by_name
    return self.find_element(by=By.NAME, value=name)
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 832, in find_element
    'value': value})['value']
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 297, in execute
    self.error_handler.check_response(response)
  File "C:\Users\TBMEPYG\AppData\Local\Continuum\Anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: {"request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"89","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:61820","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"name\", \"value\": \"Dt_Ref\", \"sessionId\": \"e61dd170-8dcf-11e7-a019-41573671066b\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/e61dd170-8dcf-11e7-a019-41573671066b/element"}}
Screenshot: available via screen

Best answer

To work with the elements inside the form, you first need to switch to the iframe that contains it:

driver.switch_to.frame(driver.find_element_by_xpath('//iframe[@class="full"]'))
data = driver.find_element_by_name('Dt_Ref')
data.clear()
data.send_keys('21/08/2017')
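
The snippet above uses the `find_element_by_*` helpers, and the question uses PhantomJS; current Selenium 4 releases no longer include either. Below is a minimal sketch of the same idea for newer Selenium versions, not a tested solution for this particular site. It assumes the iframe still has `class="full"` and the field is still named `Dt_Ref`, as in the code above. The function names `list_iframes` and `fill_reference_date`, the 10-second timeout, and the use of Chrome in the usage comment are illustrative choices that do not come from the original post. The first helper only parses HTML and can be used to check whether a page embeds the form in an iframe, which is the usual reason an element is visible in the browser inspector but not found by the driver.

```python
from html.parser import HTMLParser


class _IframeCollector(HTMLParser):
    """Collects the attributes of every <iframe> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes.append(dict(attrs))


def list_iframes(html):
    """Return one attribute dict per <iframe> found in the given HTML text."""
    collector = _IframeCollector()
    collector.feed(html)
    return collector.iframes


def fill_reference_date(driver, date_text, timeout=10.0):
    """Switch into the form's iframe and type date_text into the 'Dt_Ref' field."""
    # Imported lazily so list_iframes() works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Wait for the iframe to exist, then move the driver's context into it.
    # Same XPath as in the answer above; assumed to still match the page.
    wait.until(
        EC.frame_to_be_available_and_switch_to_it(
            (By.XPATH, '//iframe[@class="full"]')
        )
    )
    # The field can only be located after the switch into the iframe.
    field = wait.until(EC.presence_of_element_located((By.NAME, "Dt_Ref")))
    field.clear()
    field.send_keys(date_text)


# Example usage (assumes Chrome is installed; URL taken from the question):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("http://www.anbima.com.br/pt_br/informar/sistema-reune.htm")
#   print(list_iframes(driver.page_source))  # inspect which iframes exist
#   fill_reference_date(driver, "21/08/2017")
#   driver.switch_to.default_content()       # back to the top-level page
#   driver.quit()
```

The explicit waits are there because an iframe and its contents may load after `driver.get` returns, in which case an immediate lookup can fail even with the correct locator. Calling `driver.switch_to.default_content()` afterwards returns the driver to the top-level page.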

Regarding "javascript - Cannot find the forms in the HTML code - Web scraping with Python and Selenium", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45969130/
