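I am trying to verify the content of a PDF. I get the URL from an href and pass it into the code below. The URL uses HTTPS, and I am running into the problem shown below. Can anyone help me get past this and read the PDF data? Thanks in advance.
The URL I tried is https://XXXXXXXXXXXXXXXXX/XXXX/XXXXXXXXXXXX?docType=pdf&docid=2229123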
URL PDFUrl = new URL(url);
BufferedInputStream TestFile = new BufferedInputStream(PDFUrl.openStream());
PDFParser TestPDF = new PDFParser((RandomAccessRead) TestFile);
TestPDF.parse();
String TestText = new PDFTextStripper().getText(TestPDF.getPDDocument());
System.out.println("Document Text is "+ TestText);
The error is:
java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.security.ssl.SSLSocketImpl.connect(Unknown Source)
at sun.security.ssl.BaseSSLSocketImpl.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.<init>(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.New(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at java.net.URL.openStream(Unknown Source)
Best answer
Are you setting the driver's desired capabilities to accept SSL certificates?
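// Ask Chrome to accept untrusted or self-signed SSL certificates.
DesiredCapabilities dc = DesiredCapabilities.chrome();
dc.setCapability(CapabilityType.ACCEPT_SSL_CERTS, true);
// Start the browser with those capabilities applied.
WebDriver driver = new ChromeDriver(dc);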
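One thing that may be worth checking as well: the stack trace in the question fails inside `Socket.connect`, which is before any certificate exchange, and `URL.openStream()` opens its own connection rather than going through the browser. The following is only a rough sketch, using the JDK alone, for testing whether the JVM can reach the URL at all. The class name `PdfFetch`, its method names, and the timeout parameter are hypothetical and not part of the original question or answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class PdfFetch {

    // Returns true when the bytes begin with the "%PDF-" file signature.
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads the whole stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Downloads the URL with explicit timeouts so a blocked connection
    // fails quickly instead of waiting for the operating system default.
    static byte[] fetch(String url, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }
}
```

If `fetch` also times out, the cause is more likely at the network level (a proxy, firewall, or VPN the browser uses but the JVM does not) than in the certificate settings. If it succeeds and `looksLikePdf` returns true, the bytes could then be handed to PDFBox, for example via `PDDocument.load(byte[])` in PDFBox 2.x.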
Regarding java - How to read PDF content in Selenium, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60630496/