java - How can I read a page with infinite scrolling using Selenium's HtmlUnit driver in Java?

Tags: java html selenium

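For example, if I wanted to use Selenium to find a post from a year ago on Facebook, how would I scroll down and get the text? I have figured out how to scroll with Selenium, but whenever I try to get elements or the page source, the result only contains the initially loaded page and none of the content I scrolled down to. I am not actually using this for Facebook, but for a site that does not have a Java dev kit: StockTwits.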

Best answer

The logic I followed in this example finds a post by its text content:

  Let allPosts be an empty set

    While not timed out
      Get all the currently visible posts that have text in them (currentPosts)
      Remove allPosts from currentPosts [so that we don't check the same post again]
      Add currentPosts to allPosts [to maintain the list of checked posts]
      For each post in currentPosts
        If the post's text contains the given text
          stop
      Scroll to the bottom [which triggers the AJAX call that loads more posts]
      // Replace the scroll with a click on a button such as "Load more" if scrolling did not trigger the AJAX load
      Wait until the page has loaded
      Do it again

This worked very well for me: I found a post on my wall from my birthday [1 month ago].

It took 20 minutes [depending on the number of posts and how old the post is, it can take longer].
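Because the loop described above only ends on a match or on the timeout, a feed that runs out of posts still costs the full wait. The following is a minimal, browser-free sketch of the same loop with one extra early exit: it stops when a round yields no unseen posts. It is not part of the original answer. `ScrollSearchSketch`, `search`, `visiblePosts`, `loadMore` and `maxRounds` are hypothetical names, and posts are deduplicated by their text rather than by element, which is an assumption made to keep the sketch free of Selenium types.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Browser-free sketch of the scroll-and-search loop; all names are hypothetical. */
class ScrollSearchSketch {

    /**
     * @param visiblePosts returns the text of every post currently on the page
     * @param loadMore     scrolls (or clicks "Load more") and waits for new content
     * @param needle       text to look for
     * @param maxRounds    upper bound on scroll rounds, standing in for the timeout
     * @return true as soon as a not-yet-checked post contains the needle
     */
    static boolean search(Supplier<List<String>> visiblePosts, Runnable loadMore,
                          String needle, int maxRounds) {
        Set<String> seen = new HashSet<>();
        for (int round = 0; round < maxRounds; round++) {
            boolean foundNew = false;
            for (String text : visiblePosts.get()) {
                if (!seen.add(text)) {
                    continue; // already checked in an earlier round
                }
                foundNew = true;
                if (text.contains(needle)) {
                    return true;
                }
            }
            if (!foundNew) {
                return false; // nothing new this round: assume the feed has ended
            }
            loadMore.run();
        }
        return false;
    }
}
```

The early exit trades accuracy for speed: if `loadMore` returns before the new posts are actually rendered, the search may give up too soon, so it only makes sense together with a wait that really blocks until new content appears.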

The following searches your Facebook news feed for the given text:

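import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;

// The methods below are meant to live inside a single class.

public static void fbSearch() {
    // Path to the local chromedriver binary; adjust for your machine.
    System.setProperty("webdriver.chrome.driver", "D:\\Galen\\chromedriver.exe");
    WebDriver driver = new ChromeDriver();
    driver.get("http://www.facebook.com");
    // Placeholder credentials.
    driver.findElement(By.name("email")).sendKeys("phystem");
    driver.findElement(By.name("pass")).sendKeys("yyy");
    driver.findElement(By.id("loginbutton")).click();
    waitForPageLoaded(driver);
    fbPostSearch(driver, "True Story", 20); // timeout in minutes
}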

public static boolean fbPostSearch(WebDriver driver, String postContent, int timeOutInMins) {
    Set<WebElement> allPosts = new HashSet<>();
    long totalTime = timeOutInMins * 60000L; // in milliseconds
    long startTime = System.currentTimeMillis();
    boolean timeEnds = false;
    while (!timeEnds) {
        List<WebElement> posts = getPosts(driver);
        posts.removeAll(allPosts); // drop posts that were already searched
        allPosts.addAll(posts);    // remember the new posts
        for (WebElement post : posts) {
            String content = post.getText();
            if (content.contains(postContent)) {
                // This is the post we are looking for: move to it and highlight it.
                System.out.println("Found");
                new Actions(driver).moveToElement(post).build().perform();
                ((JavascriptExecutor) driver).executeScript("arguments[0].style.outline='2px solid #ff0';", post);
                return true;
            }
        }
        scrollToBottom(driver);    // triggers loading of more posts
        waitForPageLoaded(driver);
        timeEnds = (System.currentTimeMillis() - startTime >= totalTime);
    }
    System.out.println("Not Found");
    return false;
}

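public static List<WebElement> getPosts(WebDriver driver) {
    // Find only posts that have text content, because some posts are image-only.
    // The class names are the ones from the original answer; they are generated
    // by Facebook and may no longer match the current markup.
    return driver.findElements(By.cssSelector("div._4-u2.mbm._5v3q._4-u8 div._5pbx.userContent"));
}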

private static void scrollToBottom(WebDriver driver) {
    // Take the largest of the height properties so this works across layouts.
    // Going through Number avoids a ClassCastException if the driver returns a Double.
    long longScrollHeight = ((Number) ((JavascriptExecutor) driver).executeScript("return Math.max("
            + "document.body.scrollHeight, document.documentElement.scrollHeight,"
            + "document.body.offsetHeight, document.documentElement.offsetHeight,"
            + "document.body.clientHeight, document.documentElement.clientHeight);"
    )).longValue();
    ((JavascriptExecutor) driver).executeScript("window.scrollTo(0, " + longScrollHeight + ");");
}

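public static void waitForPageLoaded(WebDriver driver) {
    // Note: document.readyState reflects the document load only; it does not
    // wait for AJAX requests that are triggered later by scrolling.
    ExpectedCondition<Boolean> expectation = new ExpectedCondition<Boolean>() {
        @Override
        public Boolean apply(WebDriver driver) {
            return ((JavascriptExecutor) driver).executeScript(
                    "return document.readyState").equals("complete");
        }
    };
    // Selenium 3 constructor (timeout in seconds). In Selenium 4 use
    // new WebDriverWait(driver, java.time.Duration.ofSeconds(20)) instead.
    WebDriverWait wait = new WebDriverWait(driver, 20);
    wait.until(expectation);
}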
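Since `document.readyState` is usually already `complete` when a scroll triggers an AJAX load, one possible alternative is to wait until the number of posts actually grows. The helper below is a minimal sketch of that idea in plain Java, not something taken from the original answer; `GrowthWait`, `waitForGrowth` and their parameters are hypothetical names.

```java
import java.time.Duration;
import java.util.function.IntSupplier;

/** Sketch of a "wait until more items appear" helper; all names are hypothetical. */
class GrowthWait {

    /**
     * Polls count until it exceeds previous or the timeout elapses.
     *
     * @return true if the count grew in time, false on timeout or interruption
     */
    static boolean waitForGrowth(IntSupplier count, int previous,
                                 Duration timeout, Duration pollInterval) {
        long start = System.nanoTime();
        while (true) {
            if (count.getAsInt() > previous) {
                return true;
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

With the methods above, this could be called right after `scrollToBottom(driver)` as `GrowthWait.waitForGrowth(() -> getPosts(driver).size(), before, Duration.ofSeconds(20), Duration.ofMillis(500))`, where `before` is the post count taken before scrolling. Selenium's own `ExpectedConditions.numberOfElementsToBeMoreThan` used with `WebDriverWait` should serve the same purpose.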

Regarding "java - How can I read a page with infinite scrolling using Selenium's HtmlUnit driver in Java?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31568035/
