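I'm new to Java, so please bear with my current lack of knowledge. I'm trying to write a program that scrapes publicly available data from various websites. I'm starting with a simple Java class (see the SimpleWebScraper class below). The class works for some websites (for example https://www.codetriage.com/ ), but others (for example https://forum.mrmoneymustache.com/ask-a-mustachian/ ) return an SSL handshake fatal alert (see below).
I'm running macOS 10.14.6 with the Java SE Runtime Environment (build 1.7.0_51-b13) and the Java HotSpot 64-Bit Server VM (build 24.51-b03, mixed mode), and I'm using IntelliJ IDEA 2019.2 Community Edition (runtime version 11.0.3+12-b304.10 x86_64).
Could you suggest what I should change in this code or in my environment to resolve this fatal alert?
Thanks in advance for your help.
SimpleWebScraper: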
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SimpleWebScraper {

    public static void main(String[] args) {
        // Determine the number of command line arguments passed to the program...
        int numberOfCommandLineArguments = args.length;
        if (numberOfCommandLineArguments < 1) {
            System.out.println("\nYou failed to include the URL for the web site you want scraped in the command line.\n");
        } else {
            // Initialize the string that holds the URL for the web site we want to scrape
            String webSiteURL = args[0];
            System.out.printf("\nTrying to retrieve the title from the URL: %s\n", webSiteURL);
            try {
                // Create a document object and use JSoup to fetch the website
                Document webSiteDocument = Jsoup.connect(webSiteURL).get();
                // Use JSoup's title() method to fetch the title
                System.out.printf("\tTitle: %s\n\n", webSiteDocument.title());
            } catch (IOException e) {
                // In case of any IO errors, we want the messages written to the console
                e.printStackTrace();
            }
        }
    }
}
Fatal alert message:
Trying to retrieve the title from the URL: https://forum.mrmoneymustache.com/ask-a-mustachian/
javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
    at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:1959)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1077)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:746)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:722)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:306)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:295)
    at SimpleWebScraper.main(SimpleWebScraper.java:22)
Process finished with exit code 0
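Best answer
I updated the JDK to 1.8, changed the relevant macOS environment variables to point to the 1.8 JDK, and configured IntelliJ IDEA to use the 1.8 JDK. Together, these changes eliminated the error message, and the code I included in the question now runs as designed. :-)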
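A likely reason the upgrade helps, though the answer does not confirm it, is the TLS version: Java 7 clients enable only up to TLS 1.0 by default, whereas Java 8 enables TLS 1.2, and many servers now refuse the older protocols. The sketch below is one way to check which JDK a program actually runs on and which TLS protocol versions that JVM offers. The class name TlsDiagnostics and its methods are hypothetical and not part of the original post; it uses only standard JDK calls and avoids Java 8 features so it can also run on a 1.7 runtime.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;

// Hypothetical diagnostic helper; not part of the original question or answer.
final class TlsDiagnostics {

    // TLS/SSL protocol versions the default SSLContext enables by default.
    static String[] enabledProtocols() {
        return defaultContext().getDefaultSSLParameters().getProtocols();
    }

    // TLS/SSL protocol versions the default SSLContext could use if enabled.
    static String[] supportedProtocols() {
        return defaultContext().getSupportedSSLParameters().getProtocols();
    }

    // Wraps the checked exception so callers do not need a throws clause.
    private static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        // Shows which JDK/JRE actually runs the program (IDE and shell may differ).
        System.out.printf("Java version: %s%n", System.getProperty("java.version"));
        System.out.printf("Java home:    %s%n", System.getProperty("java.home"));
        System.out.printf("Enabled by default: %s%n", Arrays.toString(enabledProtocols()));
        System.out.printf("Supported:          %s%n", Arrays.toString(supportedProtocols()));
    }
}
```

Running this with the same run configuration as SimpleWebScraper shows whether the IDE really picked up the new JDK. If TLSv1.2 is missing from the enabled list, a protocol mismatch is a plausible cause of the handshake_failure, although a cipher-suite mismatch can produce the same alert.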
Source: the Stack Overflow question about an SSL handshake fatal error in a simple Java web scraper: https://stackoverflow.com/questions/57452786/