python - Handling a slow-loading webpage: removing the hardcoded delay from my script

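I've written a script in Python using Selenium to parse some names from a webpage that uses lazy loading: it reveals more content each time you scroll to the bottom. My script does this without errors. The one problem I can't solve is removing the hardcoded delay from my script. To make it more efficient, I'd like to use an explicit wait instead of the hardcoded delay while keeping the logic of the script intact, but I can't work out how. Thanks in advance for any help.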

Webpage link

This is what I've tried so far (it works):

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("find_the_link_above")

last_len = len(driver.find_elements(By.CLASS_NAME, "listing__name--link"))
new_len = last_len

while True:
    last_len = new_len
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    time.sleep(3)  # I wish to drop this hardcoded delay and use an explicit wait instead

    items = driver.find_elements(By.CLASS_NAME, "listing__name--link")
    new_len = len(items)
    if last_len == new_len:
        break

for item in items:
    print(item.text)
driver.quit()

Best answer

This is how you can implement an explicit wait:

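from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://www.yellowpages.ca/search/si/1/coffee/all%20states")

# find_elements(By.CLASS_NAME, ...) works on Selenium 3 and 4;
# find_elements_by_class_name was removed in Selenium 4.3.
# Collect the initial items so `items` is defined even if the first wait times out.
items = driver.find_elements(By.CLASS_NAME, "listing__name--link")
last_len = len(items)

while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        # Wait up to 3 seconds for more items to appear than on the previous pass.
        WebDriverWait(driver, 3).until(
            lambda d: len(d.find_elements(By.CLASS_NAME, "listing__name--link")) > last_len
        )
    except TimeoutException:
        # No new items arrived in time: treat this as the end of the list.
        break
    items = driver.find_elements(By.CLASS_NAME, "listing__name--link")
    last_len = len(items)

for item in items:
    print(item.text)
driver.quit()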

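This should let you scroll down and wait up to 3 seconds (increase the timeout if needed) for the number of elements to grow on each pass of the loop, or break out of the while loop if the count stays the same.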
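To make the idea concrete without needing a browser, here is a minimal, dependency-free sketch of the same "scroll, then poll until the count grows or a timeout expires" pattern. It is not Selenium's implementation: `wait_until`, `WaitTimeout`, `FakePage`, and `load_all` are hypothetical names invented for this illustration, and `FakePage` only simulates a lazy-loading page with a short artificial delay.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition does not become truthy in time (hypothetical name)."""


def wait_until(condition, timeout=3.0, poll=0.01):
    """Call condition() repeatedly until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


class FakePage:
    """Hypothetical stand-in for a lazy-loading page.

    Each scroll schedules one more batch, which becomes visible after load_delay seconds.
    """

    def __init__(self, batches, batch_size, load_delay=0.05):
        self.remaining = batches
        self.batch_size = batch_size
        self.load_delay = load_delay
        self.items = batch_size  # the first batch is visible on load
        self.pending_at = None   # time at which the scheduled batch becomes visible

    def scroll_to_bottom(self):
        if self.remaining > 0 and self.pending_at is None:
            self.remaining -= 1
            self.pending_at = time.monotonic() + self.load_delay

    def count_items(self):
        if self.pending_at is not None and time.monotonic() >= self.pending_at:
            self.items += self.batch_size
            self.pending_at = None
        return self.items


def load_all(page, timeout=0.5):
    """Scroll until the item count stops growing and return the final count."""
    last_len = page.count_items()
    while True:
        page.scroll_to_bottom()
        try:
            # Returns as soon as new items show up, instead of always sleeping.
            wait_until(lambda: page.count_items() > last_len, timeout=timeout)
        except WaitTimeout:
            return last_len
        last_len = page.count_items()


page = FakePage(batches=3, batch_size=10)
total = load_all(page)
print(total)
```

The point of the pattern is that each pass waits only as long as the new batch actually takes to appear. The full timeout is paid once, on the final pass, when nothing more loads.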

Regarding "python - Handling a slow-loading webpage: removing the hardcoded delay from my script", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50162457/
