python - Selenium webdriver returns an empty list from find_elements_by_X

My goal is to get a list of the names of all new items published on https://www.prusaprinters.org/prints within the 24 hours of a given day.

After some reading, I understand that I should use Selenium, because the site I am scraping is dynamic (it loads more objects as the user scrolls).

The problem is that I can't seem to get anything but an empty list from webdriver.find_elements_by_ with any of the suffixes listed at https://selenium-python.readthedocs.io/locating-elements.html.

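On the site, when I inspect the element whose title I want to get, I see "class = name" and "class = clamp-two-lines" (see screenshot), but I can't seem to get back a list of all elements on the page that have the name class or the clamp-two-lines class.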

Here is my code so far (the commented-out lines are failed attempts):

from timeit import default_timer as timer
start_time = timer()
print("Script Started")

import bs4, selenium, smtplib, time
from bs4 import BeautifulSoup 
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.Chrome(r'D:\PortableApps\Python Peripherals\chromedriver.exe')

url = 'https://www.prusaprinters.org/prints'
driver.get(url)
# foo = driver.find_elements_by_name('name')
# foo = driver.find_elements_by_xpath('name')
# foo = driver.find_elements_by_class_name('name')
# foo = driver.find_elements_by_tag_name('name')
# foo = [i.get_attribute('href') for i in driver.find_elements_by_css_selector('[id*=name]')]
# foo = [i.get_attribute('href') for i in driver.find_elements_by_css_selector('[class*=name]')]
# foo = [i.get_attribute('href') for i in driver.find_elements_by_css_selector('[id*=clamp-two-lines]')]
# foo = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="printListOuter"]//ul[@class="clamp-two-lines"]/li')))
print(foo)
driver.quit()

print("Time to run: " + str(round(timer() - start_time,4)) + "s")


Best answer

To get the text, wait for the elements to become visible. The CSS selector for the titles is #printListOuter h3:

titles = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, '#printListOuter h3')))

for title in titles:
    print(title.text)

A shorter version:

wait = WebDriverWait(driver, 10)
titles = [title.text for title in wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, '#printListOuter h3')))]
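
The question notes that the page loads more items as the user scrolls, so the waits in this answer only see the items rendered so far. One possible way to load more before collecting titles is to scroll to the bottom until the page height stops growing. This is a rough sketch, not something verified against this site. The helper name scroll_until_stable, the pause length and the round limit are all assumptions, and it presumes that newly loaded items increase document.body.scrollHeight:

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=20):
    """Scroll to the bottom until the page height stops changing.

    Hypothetical helper: returns the number of scroll rounds performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    for _ in range(max_rounds):
        # Jump to the current bottom of the page to trigger lazy loading.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # assumed to be long enough for new items to load
        new_height = driver.execute_script("return document.body.scrollHeight")
        rounds += 1
        if new_height == last_height:
            break  # nothing new was appended, so stop scrolling
        last_height = new_height
    return rounds
```

Call scroll_until_stable(driver) after driver.get(url) and before waiting for the titles. The max_rounds cap keeps the loop from running indefinitely on a listing that never runs out of items.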

Regarding python - Selenium webdriver returns an empty list from find_elements_by_X, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59868524/
