首先,我一直在尝试从此网页获取下拉列表:http://solutions.3m.com/wps/portal/3M/en_US/Interconnect/Home/Products/ProductCatalog/Catalog/?PC_Z7_RJH9U5230O73D0ISNF9B3C3SI1000000_nid=RFCNF5FK7WitWK7G49LP38glNZJXPCDXLDbl
这是我的代码:
import urllib2
from bs4 import BeautifulSoup
import re
from pprint import pprint
from selenium import webdriver
url = 'http://solutions.3m.com/wps/portal/3M/en_US/Interconnect/Home/Products/ProductCatalog/Catalog/?PC_Z7_RJH9U5230O73D0ISNF9B3C3SI1000000_nid=RFCNF5FK7WitWK7G49LP38glNZJXPCDXLDbl'
element_xpath = '//*[@id="Component1"]'
driver = webdriver.PhantomJS()
driver.get(url)
element = driver.find_element_by_xpath(element_xpath)
element_xpath = '/option[@value="02"]'
all_options = element.find_elements_by_tag_name("option")
for option in all_options:
print("Value is: %s" % option.get_attribute("value"))
option.click()
source = driver.page_source.encode('utf-8', 'ignore')
driver.quit()
source = str(source)
soup = BeautifulSoup(source, 'html.parser')
print soup
打印出来的是这样的:
Traceback (most recent call last):
File "../../../../test.py", line 58, in <module>
Value is: XX
main()
File "../../../../test.py", line 46, in main
option.click()
File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webelement.py", line 54, in click
self._execute(Command.CLICK_ELEMENT)
File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webelement.py", line 228, in _execute
return self._parent.execute(command, params)
File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webdriver.py", line 165, in execute
self.error_handler.check_response(response)
File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/errorhandler.py", line 158, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotVisibleException: Message: u'{"errorMessage":"Element is not currently visible and may not be manipulated","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"81","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:51413","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\\"sessionId\\": \\"30e4fd50-f0e4-11e3-8685-6983e831d856\\", \\"id\\": \\":wdc:1402434863875\\"}","url":"/click","urlParsed":{"anchor":"","query":"","file":"click","directory":"/","path":"/click","relative":"/click","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/click","queryKey":{},"chunks":["click"]},"urlOriginal":"/session/30e4fd50-f0e4-11e3-8685-6983e831d856/element/%3Awdc%3A1402434863875/click"}}' ; Screenshot: available via screen
而这一切中最奇怪、最令人愤怒的一点是,有时它实际上一切都奏效了。我不知道这里发生了什么。
更新
似乎我对其他网站上下拉表单的可见性没有问题,只有这个。有没有什么东西可以使表单不可见(如果是这样,为什么只有 95% 的时间)?是否存在加载页面时可能导致内容不可见的问题?
最佳答案
我是这样将 Selenium Webdriver 与 Python3 结合使用的:
all_options = self.driver.find_element_by_id("Component1")
options = all_options.find_elements_by_tag_name("option")
for each_option in all_options:
print(each_option.get_attribute("value"))
如果您尝试从文本框中选择内容,请按以下步骤操作:
select = Select(self.driver.find_element_by_id("Component1"))
select.select_by_visible_text("02")
关于python - 如何在 Selenium 的下拉列表中选择项目,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/24150970/