I have an ArrayList containing a list of websites in the following format:
- google.com
- facebook.com
- youtube.com
- yahoo.com
- wikipedia.org
- t.co
I have to read the HTML text from all of the links. Some links cause problems, for example t.co, while the others work fine.
Code:
    try {
        String line = "t.co";
        String[] Add_words = line.split("[//:.]");
        if (Add_words[0].contains("http")) {
        } else if (Add_words[0].contains("www"))
            line = "http://" + line;
        else if (!Add_words[0].contains("http") && !Add_words[0].contains("www"))
            line = "http://www." + line;
        URL url = new URL(line);
        URLConnection urlConnection = url.openConnection();
        HttpURLConnection connection = null;
        if (urlConnection instanceof HttpURLConnection) {
            connection = (HttpURLConnection) urlConnection;
        } else {
            System.out.println("Please enter an HTTP URL.");
            return;
        }
        BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream()));
        String urlString = "";
        String current;
        while ((current = in.readLine()) != null) {
            urlString += current + "\n";
        }
        System.out.println(urlString);
    } catch (IOException e) {
        e.printStackTrace();
    }

And I'm getting the error with the last link `t.co`.
Error:
    java.io.FileNotFoundException: http://www.t.co
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1834)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1439)
        at com.test.code.Main.main(Main.java:109)
What I need: I have a list of links in the format above, and my code should be able to visit all of them, whatever format each link is in.
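Best answer
You are adding `www.` to `t.co`, but `www.t.co` is not a correct address and results in a 404 Not Found. `HttpURLConnection.getInputStream()` reports a 404 response as a `FileNotFoundException`, which is the error you see.
Simply don't add `www.` to the URL.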
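To illustrate the suggestion, here is a minimal sketch of one way to prepare the links: add a scheme only when it is missing, and never prepend `www.`. The class name `UrlNormalizer` and the helpers `normalize` and `fetch` are hypothetical names chosen for this example, not part of the original code. Checking the status code before reading the body is an extra step assumed here so that a failing link produces a readable message instead of a `FileNotFoundException`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class UrlNormalizer {

    // Hypothetical helper: prepend "http://" only when no scheme is present.
    // It deliberately never adds "www.", because hosts such as t.co have no www alias.
    static String normalize(String site) {
        String trimmed = site.trim();
        String lower = trimmed.toLowerCase();
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return trimmed;
        }
        return "http://" + trimmed;
    }

    // Hypothetical helper: download the body of a site as text.
    // The status code is checked first so that 4xx/5xx responses are reported clearly.
    static String fetch(String site) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(normalize(site)).openConnection();
        try {
            int status = connection.getResponseCode();
            if (status >= 400) {
                throw new IOException("HTTP " + status + " for " + connection.getURL());
            }
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    connection.getInputStream(), StandardCharsets.UTF_8))) {
                String current;
                while ((current = in.readLine()) != null) {
                    body.append(current).append('\n');
                }
            }
            return body.toString();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) {
        // Only normalization is shown here, so the example runs without network access.
        List<String> sites = Arrays.asList(
                "google.com", "www.facebook.com", "http://t.co", "t.co");
        for (String site : sites) {
            System.out.println(site + " -> " + normalize(site));
        }
    }
}
```

Calling `UrlNormalizer.fetch("t.co")` would then request `http://t.co` rather than `http://www.t.co`. Note that `HttpURLConnection` does not follow redirects that switch from `http` to `https`, so some sites may still answer with a 3xx status and an empty body.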
Source: the Stack Overflow question "java - URL connection in Java": https://stackoverflow.com/questions/25521715/