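I am trying to extract my bookmarks from Chrome's bookmarks file, which is stored in JSON format. I have a large number of bookmarks. The sample file below comes from a fresh Google profile, created so that the file contains only a few elements.
So far I have built a Java program that iterates over the file and extracts the keys. My problem is that some keys hold arrays, and each item in an array has multiple keys of its own. I am currently trying to work out how to get at those individual elements.
Each bookmark is identified by a unique ID. So if I can fetch an entry by its ID and then associate every other element of that entry with the unique ID, I believe I will have each bookmark.
My end goal is to put the bookmarks into a database so that I can organize them better: searching, finding duplicates, categorizing, adding comments, and so on.
My Java program is below. Below the Java program is its output after running it on the attached Chrome bookmarks file.
Bookmarks file: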
Bookmarks.json:
{
"checksum": "d27be6b28b9a8879c2cb9ba6fc90df21",
"roots": {
"bookmark_bar": {
"children": [ {
"date_added": "13081990058553125",
"id": "7",
"meta_info": {
"stars.id": "ssc_c257c6390425956c",
"stars.version": "sync.server.Chrome45"
},
"name": "Google",
"sync_transaction_version": "1",
"type": "url",
"url": "https://www.google.com/"
}, {
"date_added": "13078166246742000",
"id": "9",
"meta_info": {
"stars.flags": "5",
"stars.id": "ssc_7150b291c6b52a37",
"stars.pageData": "Ig5keGVLcUJvcW5kTjZSTQ==",
"stars.type": "2"
},
"name": "Apollo III Communications",
"sync_transaction_version": "1",
"type": "url",
"url": "http://www.apollo3.com/"
} ],
"date_added": "13113606994595146",
"date_modified": "13083379523340359",
"id": "1",
"name": "Bookmarks bar",
"type": "folder"
},
"other": {
"children": [ ],
"date_added": "13113606994595154",
"date_modified": "0",
"id": "2",
"name": "Other bookmarks",
"type": "folder"
},
"sync_transaction_version": "5",
"synced": {
"children": [ ],
"date_added": "13113606994595157",
"date_modified": "0",
"id": "3",
"name": "Mobile bookmarks",
"type": "folder"
}
},
"version": 1
}
Java program that iterates over the file and extracts the bookmarks:
getChromeBookmarks.java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;
public class getChromeBookmarks {
@SuppressWarnings("resource")
public static void main(String[] args) {
String infile = "/home/users/l/j/ljames/work/json/Bookmarks.json";
String content = null;
try {
content = new Scanner(new File(infile)).useDelimiter("\\Z").next();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
JSONParser parser = new JSONParser();
try {
JSONObject json = (JSONObject) parser.parse(content);
printJsonObject(json);
} catch (ParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void printJsonObject(JSONObject jsonObj) {
for (Object key : jsonObj.keySet()) {
// based on the key types
String keyStr = (String) key;
Object keyvalue = jsonObj.get(keyStr);
// Print key and value
System.out.println("key: " + keyStr + " value: " + keyvalue);
// expand(keyvalue);
// for nested objects iteration if required
if (keyvalue instanceof JSONObject)
printJsonObject((JSONObject) keyvalue);
}
}
}
Output of the Java/JSON program:
key: checksum value: d27be6b28b9a8879c2cb9ba6fc90df21
key: roots value: {"other":{"date_added":"13113606994595154","date_modified":"0","children":[],"name":"Other bookmarks","id":"2","type":"folder"},"synced":{"date_added":"13113606994595157","date_modified":"0","children":[],"name":"Mobile bookmarks","id":"3","type":"folder"},"bookmark_bar":{"date_added":"13113606994595146","date_modified":"13083379523340359","children":[{"date_added":"13081990058553125","meta_info":{"stars.id":"ssc_c257c6390425956c","stars.version":"sync.server.Chrome45"},"name":"Google","id":"7","type":"url","url":"https:\/\/www.google.com\/","sync_transaction_version":"1"},{"date_added":"13078166246742000","meta_info":{"stars.pageData":"Ig5keGVLcUJvcW5kTjZSTQ==","stars.id":"ssc_7150b291c6b52a37","stars.type":"2","stars.flags":"5"},"name":"Apollo III Communications","id":"9","type":"url","url":"http:\/\/www.apollo3.com\/","sync_transaction_version":"1"}],"name":"Bookmarks bar","id":"1","type":"folder"},"sync_transaction_version":"5"}
key: other value: {"date_added":"13113606994595154","date_modified":"0","children":[],"name":"Other bookmarks","id":"2","type":"folder"}
key: date_added value: 13113606994595154
key: date_modified value: 0
key: children value: []
key: name value: Other bookmarks
key: id value: 2
key: type value: folder
key: synced value: {"date_added":"13113606994595157","date_modified":"0","children":[],"name":"Mobile bookmarks","id":"3","type":"folder"}
key: date_added value: 13113606994595157
key: date_modified value: 0
key: children value: []
key: name value: Mobile bookmarks
key: id value: 3
key: type value: folder
key: bookmark_bar value: {"date_added":"13113606994595146","date_modified":"13083379523340359","children":[{"date_added":"13081990058553125","meta_info":{"stars.id":"ssc_c257c6390425956c","stars.version":"sync.server.Chrome45"},"name":"Google","id":"7","type":"url","url":"https:\/\/www.google.com\/","sync_transaction_version":"1"},{"date_added":"13078166246742000","meta_info":{"stars.pageData":"Ig5keGVLcUJvcW5kTjZSTQ==","stars.id":"ssc_7150b291c6b52a37","stars.type":"2","stars.flags":"5"},"name":"Apollo III Communications","id":"9","type":"url","url":"http:\/\/www.apollo3.com\/","sync_transaction_version":"1"}],"name":"Bookmarks bar","id":"1","type":"folder"}
key: date_added value: 13113606994595146
key: date_modified value: 13083379523340359
key: children value: [{"date_added":"13081990058553125","meta_info":{"stars.id":"ssc_c257c6390425956c","stars.version":"sync.server.Chrome45"},"name":"Google","id":"7","type":"url","url":"https:\/\/www.google.com\/","sync_transaction_version":"1"},{"date_added":"13078166246742000","meta_info":{"stars.pageData":"Ig5keGVLcUJvcW5kTjZSTQ==","stars.id":"ssc_7150b291c6b52a37","stars.type":"2","stars.flags":"5"},"name":"Apollo III Communications","id":"9","type":"url","url":"http:\/\/www.apollo3.com\/","sync_transaction_version":"1"}]
key: name value: Bookmarks bar
key: id value: 1
key: type value: folder
key: sync_transaction_version value: 5
key: version value: 1
Update: here is an example of what I am trying to do, with code taken from:
https://stackoverflow.com/a/40887240/1204365
import java.io.FileReader;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
public class Bookmark {
private static String jsonFile = "/home/users/l/j/ljames/.config/google-chrome/Default/Bookmarks";
public static void main(String[] args) {
// TODO Auto-generated method stub
FileReader reader = new FileReader(jsonFile); // access the file
JSONObject jsonObject = (JSONObject) new JSONParser().parse(reader);
String checksum = jsonObject.optString("checksum");
// get root object
JSONObject root = jsonObject.getJSONObject("roots");
// get root bookmarks object from root
JSONObject bookmarks = root.getJSONObject("bookmark_bar");
// get root children array from bookmarks
JSONArray childrens = bookmarks.getJSONArray("children");
JSONObject temp;
for (int i = 0; i < childrens.size(); i++) {
// get object using index from childrens array
temp = childrens.getJSONObject(i);
// get url
String url = temp.optString("url");
}
}
}
The output/error is:
check: b8b257094128d165d7ccc70d0498cc87
Exception in thread "main" java.lang.ClassCastException: org.json.simple.JSONObject cannot be cast to org.json.simple.JSONArray
at javaTools.JsonParser.main(JsonParser.java:27)
There are six red error marks in Eclipse, on the lines listed below. The suggested fix for every line is the same as for the first one:
Line 19: String checksum = jsonObject.optString("checksum");
Error: Suggesting:
Change to 'toJSONString(..)'
Add cast to 'temp'
Rename in file (Ctrl+2 R)
This same suggestion is repeated for the other five red error marks.
Line 22: JSONObject root = jsonObject.getJSONObject("roots");
Line 25: JSONObject bookmarks = root.getJSONObject("bookmark_bar");
Line 28: JSONArray childrens = bookmarks.getJSONArray("children");
Line 33: temp = childrens.getJSONObject(i);
Line 36: String url = temp.optString("url");
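Best answer
Parsing all URLs
Extracting the links requires a nested traversal, because arrays and objects can themselves contain nested arrays.
Approach:
1.) We fetch all the keys inside the roots element and traverse them, so first parse the file into an object
try {
    jsonObject = (JSONObject) new JSONParser().parse(reader);
} catch (IOException | ParseException e) {
    e.printStackTrace();
}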
Then fetch all the keys from the required parent element (that is, roots) and traverse them with a for-each loop.
JSONObject root = (JSONObject) jsonObject.get("roots");
// fetch all keys using keySet
Set<String> set = root.keySet();
// traverse all keys using a for-each loop
for (String string : set) {
    // loop body: see step 2
}
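2.) While traversing, we simply try to cast the value stored under each key to a JSONObject. If that value is just a string in our JSON file (for example "sync_transaction_version"), a ClassCastException is thrown because a String cannot be cast to JSONObject, but it is caught, so there is nothing to worry about
for (String string : set) {
    try {
        obj = (JSONObject) root.get(string);
    } catch (ClassCastException e) {
        // not a JSONObject (for example a plain string): skip this key
        continue;
    }
}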
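As an alternative to catching the ClassCastException in step 2, the type can be checked up front with instanceof. The sketch below is only an illustration: the class name RootFolders and the method folderEntries are made up for this example, and it is written against java.util.Map so that it runs without any JSON library. Since json.simple's JSONObject extends java.util.HashMap, the same method should also accept the parsed roots object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
final class RootFolders {
    // Returns only those values of `roots` that are JSON objects (folders),
    // skipping plain values such as "sync_transaction_version".
    static List<Map<?, ?>> folderEntries(Map<?, ?> roots) {
        List<Map<?, ?>> folders = new ArrayList<>();
        for (Object value : roots.values()) {
            if (value instanceof Map) {
                folders.add((Map<?, ?>) value);
            }
        }
        return folders;
    }

    public static void main(String[] args) {
        // Stand-in for the parsed "roots" object from the Bookmarks file.
        Map<String, Object> roots = new LinkedHashMap<>();
        roots.put("bookmark_bar", Map.of("name", "Bookmarks bar", "type", "folder"));
        roots.put("sync_transaction_version", "5");
        roots.put("other", Map.of("name", "Other bookmarks", "type", "folder"));
        for (Map<?, ?> folder : folderEntries(roots)) {
            System.out.println(folder.get("name"));
        }
    }
}
```

With this check there is no exception to swallow, and a non-object value can never be mistaken for the folder handled in the previous iteration.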
3.) If it is a JSONObject, we then simply look for the children array, which is what actually contains our URL links
if (obj.containsKey("children")) {
    try {
        childrens = (JSONArray) obj.get("children");
        // call the recursive function to find nested children arrays
        // and print the urls
        printUrls(childrens);
    } catch (Exception e) {
        // try-catch to handle any unexpected case
    }
}
4.) Now for the nested-array part: because any child can itself contain a nested children array, I used recursion to find and fetch the contents of the nested arrays
public static void printUrls(JSONArray childrens) {
    JSONObject temp = null;
    for (int i = 0; i < childrens.size(); i++) {
        // get object using index from children array
        temp = (JSONObject) childrens.get(i);
        // check whether it contains a nested children array;
        // if yes, fetch the nested array and call this function
        // again to print its content
        if (temp.containsKey("children")) {
            printUrls((JSONArray) temp.get("children"));
        }
        // fetch and print the url, the value we are really after
        String url = (String) temp.get("url");
        if (url != null) {
            // display the url using print
            System.out.println(url);
            // count is a variable that is incremented whenever a url is found;
            // the total number of urls found is displayed at the end of parsing
            count++;
        }
    }
}
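The question asks for each bookmark as one item (ID plus its other fields) rather than only a printed URL. The same recursion can collect such items into a list. This is a minimal sketch under stated assumptions: BookmarkCollector, Bookmark, collect and folderPath are names invented here, and the code is typed against java.util.Map and java.util.List so it runs without a JSON library. Because json.simple's JSONObject and JSONArray extend HashMap and ArrayList, a parsed children array should be usable as the first argument.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical collector, not part of the original answer.
final class BookmarkCollector {
    // One bookmark; the fields mirror the JSON keys, plus the folder it was found in.
    static final class Bookmark {
        final String id;
        final String name;
        final String url;
        final String folderPath;

        Bookmark(String id, String name, String url, String folderPath) {
            this.id = id;
            this.name = name;
            this.url = url;
            this.folderPath = folderPath;
        }
    }

    // Converts a JSON value to text, keeping null as null.
    private static String text(Object value) {
        return value == null ? null : value.toString();
    }

    // Walks a "children" list depth-first and appends every entry that has a "url".
    static void collect(List<?> children, String folderPath, List<Bookmark> out) {
        for (Object child : children) {
            if (!(child instanceof Map)) {
                continue; // ignore anything that is not a JSON object
            }
            Map<?, ?> node = (Map<?, ?>) child;
            Object nested = node.get("children");
            if (nested instanceof List) {
                // a folder: recurse, extending the path with the folder's name
                collect((List<?>) nested, folderPath + "/" + text(node.get("name")), out);
            }
            Object url = node.get("url");
            if (url != null) {
                out.add(new Bookmark(text(node.get("id")), text(node.get("name")),
                        text(url), folderPath));
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the "children" array of the bookmark bar.
        List<Object> children = List.of(
                Map.<String, Object>of("id", "7", "name", "Google",
                        "url", "https://www.google.com/"),
                Map.<String, Object>of("id", "9", "name", "Apollo III Communications",
                        "url", "http://www.apollo3.com/"));
        List<Bookmark> found = new ArrayList<>();
        collect(children, "Bookmarks bar", found);
        for (Bookmark b : found) {
            System.out.println(b.id + " | " + b.name + " | " + b.url + " | " + b.folderPath);
        }
    }
}
```

Returning a list instead of incrementing a static counter keeps the traversal reusable: the list size gives the count, and each item carries what a database row would need.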
org.json.simple jar link: click the download jar option at the link and/or add it as a dependency jar in your project
Code
import java.io.FileReader;
import java.io.IOException;
import java.util.Set;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

// parsing code using json.simple
public class Test2 {
    // path to your file
    private static String jsonFile = "C:\\bookmarks.json";
    // number of urls found so far
    static int count = 0;

    public static void main(String[] args) {
        JSONObject jsonObject = null;
        // open the file and parse it; try-with-resources closes the reader
        try (FileReader reader = new FileReader(jsonFile)) {
            jsonObject = (JSONObject) new JSONParser().parse(reader);
        } catch (IOException | ParseException e) {
            e.printStackTrace();
            return; // nothing to traverse if the file could not be read or parsed
        }
        String checksum = (String) jsonObject.get("checksum");
        JSONObject root = (JSONObject) jsonObject.get("roots");
        Set<String> set = root.keySet();
        for (String string : set) {
            JSONObject obj;
            try {
                obj = (JSONObject) root.get(string);
            } catch (ClassCastException e) {
                // a plain string value such as "sync_transaction_version"
                // is not a folder: skip it
                continue;
            }
            if (obj != null && obj.containsKey("children")) {
                try {
                    JSONArray childrens = (JSONArray) obj.get("children");
                    printUrls(childrens);
                } catch (Exception e) {
                    // try-catch to handle any unexpected case
                }
            }
        }
        // display how many urls we have found
        System.out.println("count is " + count);
    }

    public static void printUrls(JSONArray childrens) {
        JSONObject temp = null;
        for (int i = 0; i < childrens.size(); i++) {
            // get object using index from childrens array
            temp = (JSONObject) childrens.get(i);
            // recurse into nested folders first
            if (temp.containsKey("children")) {
                printUrls((JSONArray) temp.get("children"));
            }
            // get url
            String url = (String) temp.get("url");
            if (url != null) {
                System.out.println(url);
                count++;
            }
        }
    }
}
Output: the file linked by the OP contains 2521 URLs, so they cannot all be posted here, but the final count should be enough
...
http://www.team-cymru.org/bogon-reference-http.html
http://www.team-cymru.org/bogon-reference-bgp.html
http://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt
http://tecadmin.net/enable-logging-in-iptables-on-linux/#
https://www.youtube.com/watch?v=geglU1AdmJs&t=480s
count is 2521
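For the stated goal of loading the bookmarks into a database, the date_added and date_modified strings probably need converting to real timestamps. The sketch below assumes that Chrome stores them as microseconds since 1601-01-01T00:00:00Z; that is an assumption about the file format to verify against your own data, not something stated in the question or the answer. ChromeTime and toInstant are names invented for this example.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of the original answer.
final class ChromeTime {
    // Assumed epoch of Chrome's bookmark timestamps.
    private static final Instant CHROME_EPOCH = Instant.parse("1601-01-01T00:00:00Z");

    // Converts a value such as "13081990058553125" (assumed to be microseconds
    // since CHROME_EPOCH) to a java.time.Instant.
    static Instant toInstant(String chromeMicros) {
        long micros = Long.parseLong(chromeMicros);
        return CHROME_EPOCH.plus(micros, ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        // date_added of the "Google" bookmark in the sample file
        System.out.println(toInstant("13081990058553125"));
    }
}
```

A value of "0", as in the date_modified of the empty folders, maps to the assumed epoch itself and is likely best treated as "never modified" when stored.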
Regarding "java - How to extract each element (bookmark) from a JSON file as one item using Java?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38601001/