java - How to click an anchor to download a ZIP file with HtmlUnit

Tags: java htmlunit

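I am trying to download a ZIP file with HtmlUnit 2.32 using the code below.

The "myfile.zip" I get is larger than the file downloaded through a normal browser (179 KB vs 79 KB), and it is corrupted.

How can I click the anchor and download the file with HtmlUnit?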

        WebClient wc = new WebClient(BrowserVersion.CHROME);

        final String HREF_SCARICA_CONSOLIDATI = "/web/area-pubblica/quotate?viewId=export_quotate";

        final String CONSOBBase = "http://www.consob.it";

        HtmlPage page = wc.getPage(CONSOBBase + HREF_SCARICA_CONSOLIDATI);

        final String downloadButtonXpath = "//a[contains(@href, 'javascript:downloadAzionariato()')]";
        List<HtmlAnchor> downloadAnchors = page.getByXPath(downloadButtonXpath);
        HtmlAnchor downloadAnchor = downloadAnchors.get(0);

        UnexpectedPage downloadedFile = downloadAnchor.click();

        InputStream contentAsStream = downloadedFile.getWebResponse().getContentAsStream();
        File destFile = new File("/tmp", "myfile.zip");
        Writer out = new OutputStreamWriter(new FileOutputStream(destFile));
        IOUtils.copy(contentAsStream, out);
        out.close();

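Accepted answer

I have updated your code snippet to make it work. Hopefully the inline comments help explain what is going on (tested with the latest HtmlUnit SNAPSHOT code, 2.34-SNAPSHOT of 2018/11/03).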

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import org.apache.commons.io.IOUtils;

final String HREF_SCARICA_CONSOLIDATI = "http://www.consob.it/web/area-pubblica/quotate?viewId=export_quotate";

try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_60)) {
    HtmlPage page = webClient.getPage(HREF_SCARICA_CONSOLIDATI);

    final String downloadButtonXpath = "//a[contains(@href, 'javascript:downloadAzionariato()')]";
    List<HtmlAnchor> downloadAnchors = page.getByXPath(downloadButtonXpath);
    HtmlAnchor downloadAnchor = downloadAnchors.get(0);

    // click() triggers some JavaScript (have a look at what your browser does);
    // it seems to open a new window with the file content as the response,
    // so we can ignore the page returned from click()
    downloadAnchor.click();
    // instead, we wait a bit until the JavaScript has finished
    webClient.waitForBackgroundJavaScript(1000);

    // now pick up the window/page that was opened as a result of the download
    Page downloadPage = webClient.getCurrentWindow().getEnclosedPage();

    // and finally save the content as raw bytes (OutputStream, not Writer)
    File destFile = new File("/tmp", "myfile.zip");
    try (InputStream contentAsStream = downloadPage.getWebResponse().getContentAsStream()) {
        try (OutputStream out = new FileOutputStream(destFile)) {
            IOUtils.copy(contentAsStream, out);
        }
    }

    System.out.println("Output written to " + destFile.getAbsolutePath());
}
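
A side note on the file size: the code in the question copies the response into an OutputStreamWriter, which decodes the bytes as text and encodes them again. That is a plausible reason for the larger, corrupted file, independent of which page is picked up. The sketch below illustrates the effect with the standard library only; the class name BinaryCopyDemo, its two helper methods, the sample bytes, and the choice of UTF-8 as the charset are all assumptions made for illustration (the question's code used the platform default charset).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of HtmlUnit or the original answer.
public class BinaryCopyDemo {

    // Copies the input byte for byte, which is what a ZIP download needs.
    static byte[] copyAsBytes(byte[] input) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(input)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Decodes the bytes to chars and encodes them again, as a Writer-based copy does.
    // UTF-8 is assumed here; bytes that are not valid UTF-8 get replaced on decoding.
    static byte[] copyThroughWriter(byte[] input) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(input), StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // ZIP local file header signature followed by a few bytes that are invalid in UTF-8.
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xFE, (byte) 0x80, (byte) 0x90};

        byte[] viaStream = copyAsBytes(zipLike);
        byte[] viaWriter = copyThroughWriter(zipLike);

        System.out.println("input length:       " + zipLike.length);
        System.out.println("stream copy equal:  " + Arrays.equals(zipLike, viaStream));
        System.out.println("writer copy length: " + viaWriter.length);
    }
}
```

The stream copy returns the input unchanged, while the Writer-based copy produces more bytes than it was given because every undecodable byte is replaced by a multi-byte replacement character. This is why the snippet above writes through a plain FileOutputStream.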

This Q&A is based on a question found on Stack Overflow: https://stackoverflow.com/questions/53126900/
