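I am using the Google AJAX Search API to search Google for a string. It returns HTML that contains the text along with all of its tags.
If I only want the text, what should I use?
My program is written in Java.
Regards,
Manjot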
Best answer
I did some googling and found this:
http://www.ajaxlines.com/ajax/stuff/article/using_google_is_ajax_search_api_with_java.php
Here is the sample code snippet from there:
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;

    import org.json.JSONArray; // JSON library from http://www.json.org/java/
    import org.json.JSONObject;

    public class GoogleQuery {

        // Put your website here
        private final String HTTP_REFERER = "http://www.example.com/";

        public GoogleQuery() {
            makeQuery("questio verum");
            makeQuery("info:http://frankmccown.blogspot.com/");
            makeQuery("site:frankmccown.blogspot.com");
        }

        private void makeQuery(String query) {
            System.out.println(" Querying for " + query);
            try {
                // Convert spaces to +, etc. to make a valid URL
                query = URLEncoder.encode(query, "UTF-8");
                URL url = new URL("http://ajax.googleapis.com/ajax/services/search/web?start=0&rsz=large&v=1.0&q=" + query);
                URLConnection connection = url.openConnection();
                connection.addRequestProperty("Referer", HTTP_REFERER);

                // Get the JSON response; try-with-resources closes the reader
                StringBuilder builder = new StringBuilder();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        builder.append(line);
                    }
                }

                // Parse the response and print the estimated number of hits
                JSONObject json = new JSONObject(builder.toString());
                System.out.println("Total results = " +
                        json.getJSONObject("responseData")
                            .getJSONObject("cursor")
                            .getString("estimatedResultCount"));

                // Print the plain-text title and the URL of each result
                JSONArray ja = json.getJSONObject("responseData")
                                   .getJSONArray("results");
                System.out.println(" Results:");
                for (int i = 0; i < ja.length(); i++) {
                    System.out.print((i + 1) + ". ");
                    JSONObject j = ja.getJSONObject(i);
                    System.out.println(j.getString("titleNoFormatting"));
                    System.out.println(j.getString("url"));
                }
            } catch (Exception e) {
                System.err.println("Something went wrong...");
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            new GoogleQuery();
        }
    }
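The snippet above prints `titleNoFormatting`, which is already plain text. If you also want plain text from a field that still carries inline markup (I am assuming here that something like the result's `content` string comes back with tags such as `<b>`), a rough sketch could look like the following. `HtmlText` and `stripTags` are names I made up for illustration, not part of any Google or JSON library, and the regex is only meant for small, simple fragments; for full HTML pages a real parser would be the safer choice.

```java
// Hypothetical helper (not part of any library): reduces a small HTML
// fragment to plain text using only the Java standard library.
final class HtmlText {

    private HtmlText() {
        // Utility class, not meant to be instantiated
    }

    static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        // Drop anything that looks like a tag, e.g. <b>, </b>, <br/>
        String text = html.replaceAll("<[^>]*>", "");
        // Decode a handful of common entities; this list is an assumption,
        // not a complete entity table. &amp; goes last so that "&amp;lt;"
        // ends up as the literal text "&lt;" instead of "<".
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&#39;", "'")
                   .replace("&amp;", "&");
        // Collapse runs of whitespace and trim the ends
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Prints: Java search & JSON parsing
        System.out.println(stripTags("<b>Java</b> search &amp; <i>JSON</i>&nbsp;parsing"));
    }
}
```

Inside the result loop you could then call something like `HtmlText.stripTags(j.getString("content"))`, assuming that field is present in the response you get back.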
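As a side note, you should be careful not to violate the Google TOS:
"You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file"
- http://www.google.com/accounts/TOS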
Source: the Stack Overflow question "Search for text using Java only": https://stackoverflow.com/questions/1546446/