javax.net.ssl.HttpsURLConnection returns gibberish

Tags: java http ssl https

I am writing a simple HTTPS client that pulls a web page's HTML over HTTPS. I can connect to the page just fine, but the HTML I pull down is gibberish.

public String GetWebPageHTTPS(String URI){
    BufferedReader read;
    URL inputURI;
    String line;
    String renderedPage = "";
    try{
        inputURI = new URL(URI);
        HttpsURLConnection connect;
        connect = (HttpsURLConnection)inputURI.openConnection();
        connect.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401");
        read = new BufferedReader (new InputStreamReader(connect.getInputStream()));
        while ((line = read.readLine()) != null)
            renderedPage += line;
        read.close();
    }
    catch (MalformedURLException e){
        e.printStackTrace();
    }
    catch (IOException e){
        e.printStackTrace();
    }
    return renderedPage;
}

When I pass it a string such as https://kat.ph/, it returns roughly 10,000 characters of gibberish.

Edit: Here is my modified code, which accepts self-signed certificates, but I still get what looks like an encrypted stream:

public String GetWebPageHTTPS(String URI){
    TrustManager[] trustAllCerts = new TrustManager[] {
        new X509TrustManager() {
            public java.security.cert.X509Certificate[] getAcceptedIssuers() {
                return null;
            }
            public void checkClientTrusted(
                java.security.cert.X509Certificate[] certs, String authType) {
            }
            public void checkServerTrusted(
                java.security.cert.X509Certificate[] certs, String authType) {
            }
        }
    };
    try {
        SSLContext sc = SSLContext.getInstance("SSL");
        sc.init(null, trustAllCerts, new java.security.SecureRandom());
        HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
    } catch (GeneralSecurityException e) {
    }
    try {
        System.out.println("URI: " + URI);
        URL url = new URL(URI);
    } catch (MalformedURLException e) {
    }
    BufferedReader read;
    URL inputURI;
    String line;
    String renderedPage = "";
    try{
        inputURI = new URL(URI);
        HttpsURLConnection connect;
        connect = (HttpsURLConnection)inputURI.openConnection();
        read = new BufferedReader (new InputStreamReader(connect.getInputStream()));
        while ((line = read.readLine()) != null)
            renderedPage += line;
        read.close();
    }
    catch (MalformedURLException e){
        e.printStackTrace();
    }
    catch (IOException e){
        e.printStackTrace();
    }
    return renderedPage;
}

Best answer

“Is it compressed? stackoverflow.com/questions/8249522/…” - Mahesh Guruswamy
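
One way to check this hypothesis is to look at the first two bytes of the response body: a gzip stream starts with the magic bytes 0x1f 0x8b. The following is only a minimal sketch of that check. The class `GzipSniffer` and its methods are hypothetical names, not part of the original code, and the sample data is compressed in memory rather than fetched from a server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipSniffer {
    // Returns true if the data starts with the gzip magic number 0x1f 0x8b.
    public static boolean looksLikeGzip(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    // Demo helper (hypothetical): gzip-compress a string in memory.
    public static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "<html></html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeGzip(plain));                  // false
        System.out.println(looksLikeGzip(gzip("<html></html>")));  // true
    }
}
```

If the bytes read from the connection pass this check, the "gibberish" is most likely a compressed body rather than an undecrypted TLS stream.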

Yes, it turned out to be just gzip compression. Here is how I solved it:

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.zip.GZIPInputStream;

public String GetWebPageGzipHTTP(String URI){
    StringBuilder html = new StringBuilder();
    try {
        URLConnection connect = new URL(URI).openConnection();
        connect.setReadTimeout(10000);
        connect.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401");
        // Reading a header sends the request; the server reports gzip via Content-Encoding.
        String encoding = connect.getHeaderField("Content-Encoding");
        InputStream body = connect.getInputStream();
        if ("gzip".equalsIgnoreCase(encoding)) {
            // Decompress the body before decoding it as text.
            body = new GZIPInputStream(body);
        }
        try (BufferedReader in = new BufferedReader(new InputStreamReader(body))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                html.append(inputLine);
            }
        }
    } catch (Exception e) {
        // As in the original, errors are swallowed and whatever was read so far is returned.
    }
    return html.toString();
}
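
The decompression step can also be separated from the network call so that it can be exercised without a live connection. The sketch below is a variation on the answer, not the author's code. `ResponseDecoder` and `decode` are hypothetical names, UTF-8 is an assumed charset (the original relies on the platform default), and unlike the original it keeps line breaks.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ResponseDecoder {
    // Reads the whole body as text, unwrapping gzip when the
    // Content-Encoding header value says so (case-insensitive).
    public static String decode(InputStream body, String contentEncoding, Charset charset)
            throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(body)
                : body;
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulate a gzip-encoded response body in memory (assumed sample data).
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        }
        InputStream body = new ByteArrayInputStream(buffer.toByteArray());
        System.out.print(decode(body, "gzip", StandardCharsets.UTF_8));
    }
}
```

With a real connection, the call would presumably look like `decode(connect.getInputStream(), connect.getContentEncoding(), StandardCharsets.UTF_8)`, where `getContentEncoding()` returns the value of the Content-Encoding header or null if it is absent.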

Source: Stack Overflow, https://stackoverflow.com/questions/16611447/
