I want to use Selenium and Python to scrape this site: https://ntrl.ntis.gov/NTRL
However, when I try to change the year in the dropdown list, it does not work.
Here is its HTML:
<div id="advSearchForm:FromYear" class="ui-selectonemenu ui-widget ui-state-default ui-corner-all" style="min-width: 63px;">
<div class="ui-helper-hidden-accessible">
<input id="advSearchForm:FromYear_focus" name="advSearchForm:FromYear_focus" type="text" autocomplete="off" role="combobox" aria-haspopup="true" aria-expanded="false" readonly="readonly" aria-autocomplete="list" aria-owns="advSearchForm:FromYear_items" aria-activedescendant="advSearchForm:FromYear_0" aria-describedby="advSearchForm:FromYear_0" aria-disabled="false">
</div>
<div class="ui-helper-hidden-accessible">
<select id="advSearchForm:FromYear_input" name="advSearchForm:FromYear_input" tabindex="-1">
<option value="*" selected="selected"><1900</option>
<option value="1900">1900</option>
<option value="1901">1901</option>
<option value="1902">1902</option>
<option value="1903">1903</option>
</select>
</div>
<label id="advSearchForm:FromYear_label" class="ui-selectonemenu-label ui-inputfield ui-corner-all"><1900</label>
<div class="ui-selectonemenu-trigger ui-state-default ui-corner-right">
<span class="ui-icon ui-icon-triangle-1-s ui-c"/>
</div>
</div>
Here is my code:
select = Select(driver.find_element_by_xpath(".//div[@id='advSearchForm:FromYear']/div[2]/select"))
select.select_by_value("1902")
But it raises an exception:
Element is not currently visible and may not be manipulated
I also tried a JavaScript snippet:
driver.execute_script("document.getElementById('advSearchForm:FromYear_input').options[2].selected = 'true'")
But that does not work either. I tested select.select_by_value(xxx) and it works on other dropdown lists, so the <div class="ui-helper-hidden-accessible"> wrapper may be the problem. How should I handle this?
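Best answer
I would suggest first clicking the element (the select element with id "advSearchForm:FromYear_input") using a click event, then using an explicit wait until the element is visible. After that you should be able to change the year with the select_by_value method.
Also, I would avoid XPath and use CSS selectors instead. Better still, create a Page Object Model to reduce the work needed to keep the tool running when the page changes in the future.
Sorry I can't be of more help; I am not very familiar with Python.
You can also refer to this question.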
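To make the Page Object Model suggestion concrete, here is a minimal sketch in Python. The class name `AdvancedSearchPage` and its method are hypothetical. The locators are assumptions based on the HTML in the question: the trigger div inside `advSearchForm:FromYear`, and the `advSearchForm:FromYear_items` id named by `aria-owns`. That the visible list renders its entries as `li` elements with a `data-label` attribute is also an assumption about this widget.

```python
class AdvancedSearchPage:
    """Hypothetical page object for the NTRL advanced search form."""

    # Selenium's By.CSS_SELECTOR is the plain string "css selector".
    CSS = "css selector"

    # ':' has a special meaning in CSS, so it is escaped with a backslash
    # (written as a doubled backslash inside a Python string literal).
    FROM_YEAR_TRIGGER = "div#advSearchForm\\:FromYear div.ui-selectonemenu-trigger"
    # Assumed markup of the visible list entries.
    FROM_YEAR_ITEM = "#advSearchForm\\:FromYear_items li[data-label='{year}']"

    def __init__(self, driver):
        self.driver = driver

    def select_from_year(self, year):
        """Open the "From" year dropdown and click the entry for `year`."""
        self.driver.find_element(self.CSS, self.FROM_YEAR_TRIGGER).click()
        item_selector = self.FROM_YEAR_ITEM.format(year=year)
        self.driver.find_element(self.CSS, item_selector).click()
```

With a real driver this would be used as `AdvancedSearchPage(driver).select_from_year("1902")`. If the page markup changes, only the two selector constants should need updating, which is the maintenance benefit the Page Object Model is meant to give.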
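Edit
It looks like the option items inside the select are used only as a master list, while the actual selection happens in another element further down the page. That element is built dynamically with JavaScript, so the suggestion I made in the comments will not work.
I wrote a working application in C# to give you an idea of what needs to be done: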
private static void Main(string[] args)
{
// ':' has a special meaning in CSS selectors so we need to escape it using \\
const string dropdownButtonSelector = "div#advSearchForm\\:datePublPanel div.ui-selectonemenu-trigger";
// {0} is a placeholder which is used to insert text during runtime
const string dynamicallyBuiltListItemSelectorTemplate = "ul#advSearchForm\\:FromYear_items li[data-label=\"{0}\"]";
// Rather than being a constant this value will be determined at runtime
const string valueToSelect = "1902";
// Setup driver and wait
ChromeDriver driver = new ChromeDriver();
WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(5));
// Load page
driver.Navigate().GoToUrl("https://ntrl.ntis.gov/NTRL/");
// Wait until the first (index 0) dropdown list button inside the publication date div is deemed "clickable"
wait.Until(ExpectedConditions.ElementToBeClickable(driver.FindElementsByCssSelector(dropdownButtonSelector)[0]));
Console.WriteLine("Element is visible");
// Open the dropdown list
driver.FindElementsByCssSelector(dropdownButtonSelector)[0].Click();
Console.WriteLine("Dropdown should be open");
// Select the element from the dynamic Javascript built list
string desiredValueListItemSelector = string.Format(dynamicallyBuiltListItemSelectorTemplate, valueToSelect);
driver.FindElementByCssSelector(desiredValueListItemSelector).Click();
Console.WriteLine($"Selected value {valueToSelect} using selector: {desiredValueListItemSelector}");
Console.ReadLine();
driver.Close();
}
================================================== ===========================
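Edit 2
Adding a Python answer. I have never written Python before, but this seems to work. I strongly recommend looking at the links I posted above about the Page Object Model and explicit waits, and avoiding XPath selectors.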
from selenium import webdriver
from time import sleep
# Set the year to select
fromYearToSelect = "1902"
# Create the driver and load the page
driver = webdriver.Chrome(r"C:\chromedriver_win32\chromedriver.exe")
driver.get("https://ntrl.ntis.gov/NTRL/")
# Find and click the "From" dropdown; elems[1] is the "To" dropdown
elems = driver.find_elements_by_css_selector("div#advSearchForm\\:datePublPanel div.ui-selectonemenu-trigger")
elems[0].click()
# Select the year
driver.find_element_by_css_selector("#advSearchForm\\:FromYear_items li[data-label='{0}']".format(fromYearToSelect)).click()
# Wait to see the results (we should be using an Explicit Wait here)
sleep(2)
# Close the driver
driver.close()
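As the comment in the script notes, an explicit wait would be preferable to sleep(2). A minimal sketch of that variant follows, assuming Selenium's `WebDriverWait` and `expected_conditions` helpers. The function names `css_for_id` and `select_year` and the 5-second default timeout are hypothetical choices for illustration, and the trigger is located through the `advSearchForm:FromYear` div from the question's HTML rather than by index.

```python
def css_for_id(element_id):
    """Build a CSS id selector, escaping ':' which CSS treats specially."""
    return "#" + element_id.replace(":", "\\:")


def select_year(driver, year, timeout=5):
    """Open the "From" dropdown and click `year`, waiting instead of sleeping."""
    # Imported here so css_for_id can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    trigger = css_for_id("advSearchForm:FromYear") + " div.ui-selectonemenu-trigger"
    item = css_for_id("advSearchForm:FromYear_items") + " li[data-label='{0}']".format(year)

    # Wait for the dropdown button to become clickable, then open the list.
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, trigger))).click()
    # Wait for the dynamically built list item, then click it.
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, item))).click()
```

After `driver.get("https://ntrl.ntis.gov/NTRL/")`, calling `select_year(driver, "1902")` would take the place of the two click steps in the script above. Each wait raises a timeout error if the element never becomes clickable, instead of failing immediately or pausing for a fixed time.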
Regarding "javascript - How to extract options from "<div class='ui-helper-hidden-accessible'>" with the Selenium driver?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46481165/