java - URL Scanner breaks lines

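I have a website and I want to read some of its content.

I am using a Scanner, but it always breaks a line before the whole line has been read.

Here is my code: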

URL url = new URL("http://whereisthemonkey.weebly.com/better-mob-ai.html");
        InputStream inputStream = url.openStream();

        Scanner scanner = new Scanner(inputStream, "UTF-8");
        //scanner.useDelimiter("\\n");
        while(scanner.hasNext()){
            String line = scanner.nextLine();
            if(line.startsWith("<meta property=\"og:description\" content=\"I nformation")){
                line = line.replace(" ", "").replace("┬", "").replace("á", "");
                System.out.println(line);
                line = line.substring(line.indexOf("Status:") + 7, line.indexOf("Status:") + 12);

                int latestVersion = Integer.valueOf(line);
                if(latestVersion == 0){
                    scanner.close();
                    inputStream.close();
                    System.err.println("/=============================================================================\\");
                    System.err.println("|[Better MobAI] The developing team of Better MobAI encountered a major error:|");
                    System.err.println("|[Better MobAI] The plugin will be therefore disabled!                        |");  
                    System.err.println("\\============================================================================/");
                    return false;
                }
                if(latestVersion == 1){
                    scanner.close();
                    inputStream.close();
                    return true;
                }
            }
        }
        scanner.close();
        inputStream.close();

Does anyone know what I am doing wrong? This is the output I get:

<metaproperty="og:description"content="InformationááááááááááááááááCurrentversion:1.9áááááááááááááááááááááááááááááááááááááááá..."/>

Thanks!

Best answer

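First, I fetched the full HTML content of your website to inspect it.

Yesterday I found only one occurrence of the word "Status". At that point the condition in your if-statement was wrong, because the word did not appear in the line matched by your startsWith condition.

Today (the website has been updated) I found two occurrences of "Status". Your if-statement condition is now correct, because the line it matches contains the word. You can change endIndex to line.indexOf("Status:") + 8. The other occurrence of "Status" is ignored, because once one of your latestVersion == __ conditions is true, the return exits the loop.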
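To make the index arithmetic concrete, here is a minimal sketch. The class name, the method name, and the sample line are hypothetical and not copied from the real page; the sketch only assumes that, once spaces have been removed, a single digit directly follows "Status:".

```java
class StatusOffsets {
    // Returns the single character that follows "Status:" in the line,
    // or null when the marker is absent or nothing follows it.
    static String digitAfterStatus(String line) {
        String marker = "Status:";            // 7 characters long
        int start = line.indexOf(marker);
        if (start < 0) {
            return null;                      // marker not on this line
        }
        int begin = start + marker.length();  // same as indexOf("Status:") + 7
        int end = begin + 1;                  // same as indexOf("Status:") + 8
        if (end > line.length()) {
            return null;                      // line ends right after the marker
        }
        return line.substring(begin, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample line (an assumption about the page format).
        String sample = "Currentversion:1.9Status:1Notes:none";
        String digit = digitAfterStatus(sample);
        System.out.println(digit);                    // prints 1
        System.out.println(Integer.parseInt(digit));  // prints 1
    }
}
```

With an end index of + 12, as in the question, the same sample would yield "1Note", which Integer.valueOf cannot parse.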

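But wait... this still makes me uneasy, because the website changes every time it is refreshed, so your condition cannot work reliably.

I therefore suggest calling string.contains("Status"); on every line that is read, like this: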

// Requires: java.io.InputStream, java.net.URL, java.util.Scanner
public static boolean latestVersion() throws Exception {
    URL url = new URL("http://whereisthemonkey.weebly.com/better-mob-ai.html");

    // try-with-resources closes the scanner and the stream on every exit path
    try (InputStream inputStream = url.openStream();
         Scanner scanner = new Scanner(inputStream, "UTF-8")) {
        int numLine = 0;
        // hasNextLine() is the check that pairs with nextLine()
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            numLine++;
            String status = "-1"; // sentinel: a value that Status never takes
            if (line.contains("Status")) {
                int indexOfStatus = line.indexOf("Status");
                // skip "Status:" (7 characters) and take the next two characters,
                // clamped so a short line cannot throw StringIndexOutOfBoundsException
                int begin = Math.min(indexOfStatus + 7, line.length());
                int end = Math.min(indexOfStatus + 9, line.length());
                status = line.substring(begin, end);
                System.out.println("line " + numLine + ": contains Status word | Status = " + status);
            }

            int latestVersion;
            try {
                // trim() removes surrounding spaces before parsing
                latestVersion = Integer.parseInt(status.trim());
            } catch (NumberFormatException e) {
                continue; // "Status" was not followed by a number on this line
            }
            if (latestVersion == 0) {
                System.err.println("/=============================================================================\\");
                System.err.println("|[Better MobAI] The developing team of Better MobAI encountered a major error:|");
                System.err.println("|[Better MobAI] The plugin will be therefore disabled!                        |");
                System.err.println("\\============================================================================/");
                return false;
            }
            if (latestVersion == 1) {
                System.out.println("latestVersion: " + latestVersion);
                return true;
            }
        }
    }
    return false;
}
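Because the page layout keeps changing, fixed character offsets remain fragile. One alternative, which is a swapped-in technique and not what the answer's code does, is to keep the parsing separate from the download and match the status with a regular expression. The sketch below is only an illustration: StatusParser and parseStatus are hypothetical names, the "Status: <digits>" format is an assumption, and so is the guess that the "á" characters in the question's output are non-breaking spaces (U+00A0).

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatusParser {
    // "Status", a colon, then 1 to 9 digits. Ordinary whitespace and
    // non-breaking spaces (U+00A0, an assumption about the page) may
    // appear on either side of the colon.
    private static final Pattern STATUS =
            Pattern.compile("Status[\\s\\u00A0]*:[\\s\\u00A0]*(\\d{1,9})");

    // Returns the first status number found in the text, or empty if none.
    static OptionalInt parseStatus(String text) {
        Matcher matcher = STATUS.matcher(text);
        if (matcher.find()) {
            // At most 9 digits, so the value always fits in an int.
            return OptionalInt.of(Integer.parseInt(matcher.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Hypothetical input, not copied from the real page.
        OptionalInt status = parseStatus("Information Current version: 1.9 Status: 1");
        System.out.println(status.isPresent() ? status.getAsInt() : -1); // prints 1
    }
}
```

Since parseStatus takes a plain String, it can be checked against sample text without any network access, and each line returned by scanner.nextLine() can be passed to it.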


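Tip: run any internet connection on a separate Thread. Downloading all of the data can take a long time, and the thread lets the download finish without blocking the rest of the program.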
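One possible way to follow this tip is to submit the check to a worker thread and wait for the result with a timeout. This is a rough sketch using java.util.concurrent; BackgroundCheck and runWithTimeout are hypothetical names, and the fallback value is an assumption about what the caller wants when the download fails or takes too long.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BackgroundCheck {
    // Runs the check on a worker thread. Returns the fallback value if the
    // check throws, returns null, or does not finish within the timeout.
    static boolean runWithTimeout(Callable<Boolean> check, long timeoutMillis, boolean fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> future = executor.submit(check);
            Boolean result = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return result != null ? result : fallback;
        } catch (TimeoutException | ExecutionException e) {
            return fallback;                      // too slow, or the check failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt flag
            return fallback;
        } finally {
            executor.shutdownNow();               // interrupt a still-running check
        }
    }

    public static void main(String[] args) {
        // Stub check standing in for the real download (an assumption);
        // in real use the lambda would call latestVersion().
        boolean ok = runWithTimeout(() -> true, 1000, false);
        System.out.println(ok); // prints true
    }
}
```

The caller still waits for at most the timeout; a program that must not wait at all would need a callback or a CompletableFuture instead.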

A similar question about java - URL Scanner breaking lines can be found on Stack Overflow: https://stackoverflow.com/questions/36992354/
