python - Scroll height returns "None" in Selenium: [arguments[0].scrollHeight]

Tags: python python-3.x selenium selenium-webdriver selenium-chromedriver

I'm working on a Python bot using Selenium, and infinite scrolling inside a dialog box doesn't work because "arguments[0].scrollHeight" returns "None".

dialogBx=driver.find_element_by_xpath("//div[@role='dialog']/div[2]")

print(dialogBx)  #<selenium.webdriver.remote.webelement.WebElement (session="fcec89cc11fa5fa5eaf29a8efa9989f9", element="31bfd470-de78-XXXX-XXXX-ac1ffa6224c4")>
print(type(dialogBx)) #<class 'selenium.webdriver.remote.webelement.WebElement'>
sleep(5)

last_height=driver.execute_script("arguments[0].scrollHeight",dialogBx);
print("Height : ",last_height) #None

I need the last height for comparison. Please suggest a solution.

Best answer

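OK, to answer your question: since you are inside a dialog, that is what we should focus on. When you execute last_height=driver.execute_script("arguments[0].scrollHeight",dialogBx); I believe you are executing it on the main page or on the wrong div (not 100% sure). Either way, I took a different approach: we select the last <li> item currently available in the dialog and scroll down to its position, which forces the dialog to update. Here is a snippet taken from the full code you will see below: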

last_li_item = driver.find_element_by_xpath('/html/body/div[4]/div/div[2]/ul/div/li[{p}]'.format(p=start_pos))
last_li_item.location_once_scrolled_into_view

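We first select the last list item and then access its location_once_scrolled_into_view property. Accessing this property scrolls the dialog down to that last item, which then makes it load more items. start_pos is simply a position within the list of <li> elements currently available. For example, in <div><li></li><li></li><li></li></div> the last li item has index 2 counting from 0. XPath positions count from 1, and in the full code start_pos has already been incremented past the last processed item by the time the scroll happens, so li[start_pos] selects that last item. I use this variable in a for loop that keeps track of the li items in the div; it will make sense once you see the full code.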
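
The scroll step can also be sketched on its own. This is only a minimal sketch, assuming `container` is the Selenium WebElement that directly holds the <li> items; the helper name scroll_to_last_item is hypothetical, and it uses XPath's last() instead of the numeric start_pos, which is equivalent only when every loaded item has already been processed:

```python
def scroll_to_last_item(container):
    """Scroll the last <li> child of `container` into view and return it."""
    # "xpath" is the same locator string as By.XPATH; last() picks the final <li> child
    last_item = container.find_element("xpath", "./li[last()]")
    # Reading this property asks the browser to scroll the element into view
    _ = last_item.location_once_scrolled_into_view
    return last_item
```

After calling it you would still wait (for example with time.sleep) before counting the <li> items again, as the full code does.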

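To run this, just change the parameters at the top and execute the test function test(). If you are already logged in to Instagram, you can simply run get_list_of_followers(). Note: this function uses the Follower class that is also in this code. You can remove it if you like, but you will need to modify the function. Important: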

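When you run this program, the number of items in the dialog keeps growing until there are no more items to load, so a TODO is to remove the elements you have already processed. Otherwise I believe performance will degrade once you start working with large numbers!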
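
One possible way to approach that TODO, shown here only as an untested sketch: detach the processed elements with the standard DOM method Element.remove() through execute_script. The helper name remove_processed_items is hypothetical, and whether the page's own scripts tolerate nodes disappearing is an assumption that would need checking:

```python
def remove_processed_items(driver, items):
    """Detach already-processed elements from the DOM and return how many were removed."""
    removed = 0
    for item in items:
        # arguments[0] is the WebElement passed in; Element.remove() detaches it from its parent
        driver.execute_script("arguments[0].remove();", item)
        removed += 1
    return removed
```

Keep in mind that removing nodes shifts the XPath positions of the remaining <li> items, so a positional lookup such as li[start_pos] would have to be offset by the number of removed items.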

If you need any further explanation, let me know. Now the code:

import time
from selenium import webdriver
from selenium.webdriver.remote.webelement import WebElement

# instagram url as our base
base_url = "https://www.instagram.com"
# =====================MODIFY THESE TO YOUR NEED=========
# the user we wish to get the followers from
base_user = "/nasa/"
# how much do you wish to sleep to wait for loading (seconds)
sleep_time = 3
# True will attempt login with facebook, False with instagram
login_with_facebook = True
# Credentials here
username = "YOUR_USERNAME"
password = "YOUR_PASSWORD"

# How many users do you wish to retrieve? -1 = all or n>0
get_users = 10
#==========================================================
# This is the div that contains all the followers info not the dialog box itself
dialog_box_xpath = '/html/body/div[4]/div/div[2]/ul/div'

total_followers_xpath = '/html/body/div[1]/section/main/div/header/section/ul/li[2]/a/span'
followers_button_xpath = '/html/body/div[1]/section/main/div/header/section/ul/li[2]/a'
insta_username_xpath = '/html/body/div[5]/div/div[2]/div[2]/div/div/div[1]/div/form/div[2]/div/label/input'
insta_pwd_xpath = '/html/body/div[5]/div/div[2]/div[2]/div/div/div[1]/div/form/div[3]/div/label/input'
insta_login_button_xpath = '/html/body/div[5]/div/div[2]/div[2]/div/div/div[1]/div/form/div[4]/button'
insta_fb_login_button_xpath = '/html/body/div[5]/div/div[2]/div[2]/div/div/div[1]/div/form/div[6]/button'

fb_username_xpath = '/html/body/div[1]/div[3]/div[1]/div/div/div[2]/div[1]/form/div/div[1]/input'
fb_pwd_xpath = '/html/body/div[1]/div[3]/div[1]/div/div/div[2]/div[1]/form/div/div[2]/input'
fb_login_button_xpath = '/html/body/div[1]/div[3]/div[1]/div/div/div[2]/div[1]/form/div/div[3]/button'

u_path = fb_username_xpath if login_with_facebook else insta_username_xpath
p_path = fb_pwd_xpath if login_with_facebook else insta_pwd_xpath
lb_path = fb_login_button_xpath if login_with_facebook else insta_login_button_xpath


# Simple class of a follower. You don't actually need this, but it is fine for the explanation.
class Follower:
    def __init__(self, user_name, href):
        self.username = user_name
        self.href = href

    @property
    def get_username(self):
        return self.username

    @property
    def get_href(self):
        return self.href

    def __repr__(self):
        return self.username


def test():
    base_user_path = base_url + base_user
    driver = webdriver.Chrome()
    driver.get(base_user_path)

    # click the followers button and will ask for login
    driver.find_element_by_xpath(followers_button_xpath).click()
    time.sleep(sleep_time)

    # now we decide if we will login with facebook or instagram
    if login_with_facebook:
        driver.find_element_by_xpath(insta_fb_login_button_xpath).click()
        time.sleep(sleep_time)
    username_input = driver.find_element_by_xpath(u_path)
    username_input.send_keys(username)
    password_input = driver.find_element_by_xpath(p_path)
    password_input.send_keys(password)
    driver.find_element_by_xpath(lb_path).click()
    # We need to wait a little longer for the page to load. Feel free to change this to your needs.
    time.sleep(10)
    # click the followers button again
    driver.find_element_by_xpath(followers_button_xpath).click()
    time.sleep(sleep_time)

    # now we get the list of followers from the dialog box. This function will return a list of follower objects.
    followers: list[Follower] = get_list_of_followers(driver, dialog_box_xpath, get_users)
    # quit the driver (this ends the browser session), we do not need it anymore.
    driver.quit()
    for follower in followers:
        print(follower, follower.get_href)


def get_list_of_followers(driver, d_xpath=dialog_box_xpath, get_items=10):
    """
    Get a list of followers from instagram
    :param driver: driver instance
    :param d_xpath: dialog box xpath. By default it gets the global parameter but you can change it
    :param get_items: how many items do you wish to obtain? -1 = Try to get all of them. Any positive number will be
    = the number of followers to obtain
    :return: list of follower objects
    """
    # getting the dialog content element
    dialog_box: WebElement = driver.find_element_by_xpath(d_xpath)
    # getting all the list items (<li></li>) inside the dialog box.
    dialog_content: list[WebElement] = dialog_box.find_elements_by_tag_name("li")
    # Get the total number of followers. The title attribute is a string that may contain thousands separators
    # ("." or ","), so we strip them before converting to int
    total_followers = int(driver.find_element_by_xpath(total_followers_xpath)
                          .get_attribute("title").replace(".", "").replace(",", ""))
    # how many items we have without scrolling down?
    li_items = len(dialog_content)
    # Decide how many followers to retrieve (n = get_items). -1 means all of them, so n becomes total_followers.
    # Any value from 0 up to total_followers is used as is. Anything else is rejected with an IndexError.
    if get_items == -1:
        get_items = total_followers
    elif -1 < get_items <= total_followers:
        # no need to change anything, it is ok to work with get_items
        pass
    else:
        # if it is not -1 and not between 0 and total_followers then we raise an error
        raise IndexError

    # You can start from a position greater than 0, but that will give you a shorter list than requested if there
    # are not enough followers available. i.e: total_followers = 10, get_items=10, start_pos=1. This will only
    # return 9 followers, not 10, even if get_items is 10.
    return generate_followers(0, get_items, total_followers, dialog_box, driver)


def generate_followers(start_pos, get_items, total_followers, dialog_box_element: WebElement, driver):
    """
    Generate followers based on the parameters
    :param start_pos: index of where to start getting the followers from
    :param get_items: total items to get
    :param total_followers: total number of followers
    :param dialog_box_element: dialog box to get the list items count
    :param driver: driver object
    :return: followers list
    """
    if -1 < start_pos < total_followers:
        # we want to count items from our current position until the last element available without scrolling. We do
        # it this way so when we scroll down, the list items will be greater but we will start generating followers
        # from our last current position not from the beginning!
        first = dialog_box_element.find_element_by_xpath("./li[{pos}]".format(pos=start_pos+1))
        li_items = dialog_box_element.find_elements_by_xpath("./li[position()={pos}][last("
                                                             ")]/following-sibling::li"
                                                             .format(pos=(start_pos + 1)))
        li_items.insert(0, first)
        print("Generating followers from position: {pos} with {li_count} list items"
              .format(pos=(start_pos+1), li_count=len(li_items)))
        followers = []
        for i in range(len(li_items)):
            anchors = li_items[i].find_elements_by_tag_name("a")
            anchor = anchors[0] if len(anchors) == 1 else anchors[1]
            follower = Follower(anchor.text, anchor.get_attribute("href"))
            followers.append(follower)
            get_items -= 1
            start_pos += 1
            print("Follower {f} added to the list".format(f=follower))
            # we stop and return if our position has reached total_followers or if get_items has reached 0 (meaning
            # that if we requested 10 items we got them all, no need to continue)
            if start_pos >= total_followers or get_items == 0:
                print("finished")
                return followers
        print("finished loop, executing scroll down...")
        last_li_item = driver.find_element_by_xpath('/html/body/div[4]/div/div[2]/ul/div/li[{p}]'.format(p=start_pos))
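        # reading this property scrolls the element into view as a side effect, which makes the dialog load more items
        last_li_item.location_once_scrolled_into_view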
        time.sleep(sleep_time)
        followers.extend(generate_followers(start_pos, get_items, total_followers, dialog_box_element, driver))
        return followers
    else:
        raise IndexError
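
For completeness, the height-comparison loop the question was aiming for can also be made to work. execute_script only passes a value back to Python when the JavaScript contains an explicit return, so "arguments[0].scrollHeight" on its own comes back as None. Below is a minimal sketch, assuming the element passed in is the one that actually scrolls (the one with the overflow); the helper names get_scroll_height and scroll_until_stable, and the pause and max_rounds parameters, are hypothetical:

```python
import time


def get_scroll_height(driver, element):
    """Return the element's scrollHeight as reported by the browser."""
    # Without the leading "return", execute_script hands back None to Python
    return driver.execute_script("return arguments[0].scrollHeight;", element)


def scroll_until_stable(driver, element, pause=1.0, max_rounds=50):
    """Scroll the element to its bottom until its height stops growing; return the final height."""
    last_height = get_scroll_height(driver, element)
    for _ in range(max_rounds):
        # Jump to the bottom of the scrollable element
        driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight;", element)
        # Give the page time to load more items
        time.sleep(pause)
        new_height = get_scroll_height(driver, element)
        if new_height == last_height:
            # Nothing new was loaded, so we assume the end of the list was reached
            break
        last_height = new_height
    return last_height
```

With the question's variables this would be called as scroll_until_stable(driver, dialogBx). The max_rounds cap is there so a very long list cannot keep the loop running indefinitely.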

Regarding python - Scroll height returns "None" in Selenium: [arguments[0].scrollHeight], we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61705729/
