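I'm trying to parse a huge JSON array with the Gson streaming API, and on each run I need to process only 10 objects at a time.
So the first run processes the first 10 objects, the second run starts from the 11th, the third from the 21st, and so on. You get the idea.
The JSON array has the following form: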
[
{ "key1": "value1"},
{ "key2": "value2"},
{ "key3": "value3"},
{ "key4": "value4"},
..........
.........
..........
{ "key10": "value10"},
..........
.........
..........
{ "key20": "value20"},
..........
.........
..........
]
I'm trying the code below, but it doesn't seem to work: it always starts parsing from the beginning. This is what I'm doing:
public static void readJsonStream(int skipRows) {
    JsonReader reader = null;
    String FILENAME = "/examples/uh_data.json";
    final InputStream stream = UHReportParser.class.getClass().getResourceAsStream(FILENAME);
    try {
        reader = new JsonReader(new InputStreamReader(stream, "UTF-8"));
        Gson gson = new GsonBuilder().create();
        // Read file in stream mode
        reader.beginArray();
        int count = 1;
        while (reader.hasNext()) {
            if (count++ <= skipRows) {
                continue;
            } else if (count > skipRows + 10) {
                break;
            } else {
                UserData data = null;
                // Read data into object model
                data = gson.fromJson(reader, UserData.class); // starts from one again
                String description = data.getDescription();
            }
        }
    } catch (UnsupportedEncodingException ex) {
        ex.printStackTrace();
    } catch (IOException ex) {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}
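What should be changed here? How can I get the expected result?
Best answer
I haven't analyzed your algorithm in depth, but it doesn't seem to skip values during the "skip" phase: `continue` on its own never consumes an element from the reader. I would definitely refactor your JSON stream reader to make it as clean as possible (at least as far as I can). That would also let you reuse such a method as much as possible. Consider the following methods: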
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.util.function.Consumer;

static void readArrayBySkipAndLimitFromBegin(final JsonReader jsonReader, final int skip, final int limit,
        final Consumer<? super JsonReader> callback)
        throws IOException {
    readArrayBySkipAndLimit(jsonReader, skip, limit, true, false, callback);
}

static void readArrayBySkipAndLimit(final JsonReader jsonReader, final int skip, final int limit, final boolean processBegin,
        final boolean processEnd, final Consumer<? super JsonReader> callback)
        throws IOException {
    // the JSON stream can be already processed somehow
    if ( processBegin ) {
        jsonReader.beginArray();
    }
    // just skip the first `skip` elements without parsing them
    for ( int i = 0; i < skip && jsonReader.hasNext(); i++ ) {
        jsonReader.skipValue();
    }
    // and limit to the `limit` just passing the JsonReader instance to its consumer elsewhere
    for ( int i = 0; i < limit && jsonReader.hasNext(); i++ ) {
        callback.accept(jsonReader);
    }
    // in case you need it ever...
    if ( processEnd ) {
        while ( jsonReader.hasNext() ) {
            jsonReader.skipValue();
        }
        jsonReader.endArray();
    }
}
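For comparison, the loop from the question can also be repaired with a small change: the skip branch has to call `skipValue()`, because `continue` alone never advances the reader, so the later `fromJson` call still sees the first element. The following is only a minimal sketch under assumptions that are not in the original: the class name `SkipValueFix`, the method `readPage`, the `Map<String, String>` element type (used in place of `UserData`), and an in-memory `StringReader` instead of the resource file.

```java
import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import com.google.gson.stream.JsonReader;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical class, for illustration only.
final class SkipValueFix {

    private static final Gson GSON = new Gson();
    private static final Type MAP_TYPE = new TypeToken<Map<String, String>>() {}.getType();

    // Skips the first skipRows elements, then parses at most pageSize elements.
    static List<Map<String, String>> readPage(Reader source, int skipRows, int pageSize) throws IOException {
        List<Map<String, String>> page = new ArrayList<>();
        try (JsonReader reader = new JsonReader(source)) {
            reader.beginArray();
            int index = 0;
            while (reader.hasNext() && page.size() < pageSize) {
                if (index++ < skipRows) {
                    // Consume the element so the reader moves on to the next one.
                    reader.skipValue();
                } else {
                    Map<String, String> element = GSON.fromJson(reader, MAP_TYPE);
                    page.add(element);
                }
            }
        }
        return page;
    }

    public static void main(String[] args) throws IOException {
        String json = "[{\"key1\":\"value1\"},{\"key2\":\"value2\"},{\"key3\":\"value3\"}]";
        System.out.println(readPage(new StringReader(json), 0, 2)); // [{key1=value1}, {key2=value2}]
        System.out.println(readPage(new StringReader(json), 2, 2)); // [{key3=value3}]
    }
}
```

The try-with-resources block closes the reader on every path, whereas the question's code closes it only inside the `IOException` handler.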
Here is the JSON document I used to test `readArrayBySkipAndLimitFromBegin` (32 array elements in total):
[
{"key1": "value1"},
{"key2": "value2"},
...
{"key31": "value31"},
{"key32": "value32"}
]
Now, test it:
import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;

private static final Gson gson = new Gson();
private static final Type mapOfStringToStringType = new TypeToken<Map<String, String>>() {}.getType();

public static void main(final String... args)
        throws IOException {
    // read up to 2B+ entries, 10 rows per step
    for ( int i = 0; i >= 0; i += 10 ) {
        System.out.print("Step #" + i / 10 + ": ");
        final Collection<Map<String, String>> maps = new ArrayList<>();
        // consume and close
        try ( final JsonReader jsonReader = Resources.getPackageResourceJsonReader(Q50737654.class, "array.json") ) {
            // consume the JSON reader, parse each array page element and add it to the result collection
            readArrayBySkipAndLimitFromBegin(jsonReader, i, 10, jr -> maps.add(gson.fromJson(jr, mapOfStringToStringType)));
        }
        System.out.println(maps);
        // an empty page means the array is exhausted
        if ( maps.isEmpty() ) {
            break;
        }
    }
    System.out.println("Done");
}
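The test above calls `Resources.getPackageResourceJsonReader`, a helper that the answer does not show. What follows is a guess at what such a helper might look like, not the answerer's actual code: it assumes the file sits on the classpath in the same package as the given class and is encoded as UTF-8. The method body and the error handling are assumptions.

```java
import com.google.gson.stream.JsonReader;

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the helper class referenced in the test.
final class Resources {

    private Resources() {
    }

    // Opens a classpath resource as a JsonReader; the caller is responsible for closing it.
    static JsonReader getPackageResourceJsonReader(Class<?> packageClass, String resourceName) throws IOException {
        // A name without a leading slash is resolved relative to the package of packageClass.
        InputStream stream = packageClass.getResourceAsStream(resourceName);
        if (stream == null) {
            throw new FileNotFoundException("Resource not found: " + resourceName);
        }
        return new JsonReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
    }
}
```

Note that the question's code calls `UHReportParser.class.getClass().getResourceAsStream(...)`, which resolves the resource through `java.lang.Class` itself rather than through `UHReportParser`; calling `getResourceAsStream` directly on the class literal, as in this sketch, is the usual form.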
Example output from running `main`:
Step #0: [{key1=value1}, {key2=value2}, {key3=value3}, {key4=value4}, {key5=value5}, {key6=value6}, {key7=value7}, {key8=value8}, {key9=value9}, {key10=value10}]
Step #1: [{key11=value11}, {key12=value12}, {key13=value13}, {key14=value14}, {key15=value15}, {key16=value16}, {key17=value17}, {key18=value18}, {key19=value19}, {key20=value20}]
Step #2: [{key21=value21}, {key22=value22}, {key23=value23}, {key24=value24}, {key25=value25}, {key26=value26}, {key27=value27}, {key28=value28}, {key29=value29}, {key30=value30}]
Step #3: [{key31=value31}, {key32=value32}]
Step #4: []
Done
As you can see, it's really easy.
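One thing to keep in mind: the test reopens the resource for every page and skips everything before that page, so later pages cost more and more skipping. If all pages are handled within a single program run, the array could instead be read once and handed over in batches. The sketch below is an alternative under that assumption, not part of the original answer; the class `SinglePassBatches` and the method `forEachBatch` are made-up names.

```java
import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import com.google.gson.stream.JsonReader;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical class, for illustration only.
final class SinglePassBatches {

    private static final Gson GSON = new Gson();
    private static final Type MAP_TYPE = new TypeToken<Map<String, String>>() {}.getType();

    // Reads the whole array once and passes the elements to the handler in groups of batchSize.
    static void forEachBatch(Reader source, int batchSize, Consumer<List<Map<String, String>>> handler)
            throws IOException {
        try (JsonReader reader = new JsonReader(source)) {
            reader.beginArray();
            List<Map<String, String>> batch = new ArrayList<>(batchSize);
            while (reader.hasNext()) {
                Map<String, String> element = GSON.fromJson(reader, MAP_TYPE);
                batch.add(element);
                if (batch.size() == batchSize) {
                    handler.accept(batch);
                    // Start a fresh list so the handler may keep the previous one.
                    batch = new ArrayList<>(batchSize);
                }
            }
            // Hand over the last, possibly shorter, batch.
            if (!batch.isEmpty()) {
                handler.accept(batch);
            }
            reader.endArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small array of five single-entry objects in memory.
        StringBuilder json = new StringBuilder("[");
        for (int i = 1; i <= 5; i++) {
            if (i > 1) {
                json.append(',');
            }
            json.append("{\"key").append(i).append("\":\"value").append(i).append("\"}");
        }
        json.append(']');
        // Prints three lines: two batches of two elements, then one batch with the last element.
        forEachBatch(new StringReader(json.toString()), 2, batch -> System.out.println(batch));
    }
}
```

If each page really is processed in a separate run, as the question suggests, the skip-and-limit approach remains the simpler fit, since nothing has to stay open between runs.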
Source: "java - Gson: how to skip rows in a JSON array while parsing with the streaming API", the corresponding question on Stack Overflow: https://stackoverflow.com/questions/50737654/