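I am trying to parse a file of pipe-delimited strings that should have 8 columns. In some cases a line yields fewer columns than expected, and I get an ArrayIndexOutOfBoundsException because the array is smaller than I expect.
How can I handle this? I always want the same array length, with blank values where there is no data.
Sample data:
In the sample data below, the first line works as expected, but the other 3 lines fail.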
1-chloro-4-nitrobenzene|100-00-5||157.553 |NO2C6H4Cl||400|FID1GC/MSGCMS
geranyl butyrate|106-29-6||224.34|C14H24O2|||
4’-methoxyacetophenone|100-06-1||150.18|C9H10O2|||
p-Anisic Acid|100-09-4|152.047|152.149|C8H8O3||400|
Result:
Invalid: column size : [5], line : geranyl butyrate|106-29-6||224.34|C14H24O2|||
Invalid: column size : [5], line : 4’-methoxyacetophenone|100-06-1||150.18|C9H10O2|||
Invalid: column size : [7], line : p-Anisic Acid|100-09-4|152.047|152.149|C8H8O3||400|
Java code:
@Test
public void testComponentsFileParsing3() {
    String fileName = "src/main/resources/admin/bulkupload_by_api/comp.txt";
    BufferedReader reader = null;
    try {
        reader = Files.newBufferedReader(Paths.get(fileName));
        String line = null;
        while ((line = reader.readLine()) != null) {
            String columns[] = line.split(Pattern.quote("|"));
            //String columns[] = StringUtils.split(line,"\\|");
            //String columns[] = line.split("\\|");
            String description = null;
            String code = null; // code & cas number are same
            String casNumber = null; // code & cas number are same
            String accurateMass = null;
            String molecularWeight = null;
            String molecularFormula = null;
            String ozoneDepletingSubstance = null;
            int estimatedShelfLife = 0;
            String technique = null;
            try {
                description = columns[0];
                code = columns[1]; // code & cas number are same
                casNumber = columns[1]; // code & cas number are same
                accurateMass = columns[2];
                molecularWeight = columns[3];
                molecularFormula = columns[4];
                ozoneDepletingSubstance = columns[5];
                estimatedShelfLife = NumberUtils.toInt(columns[6]);
                technique = columns[7];
            } catch (ArrayIndexOutOfBoundsException ae) {
                System.out.println("Invalid: column size : [" + columns.length + "], line : " + line);
                continue;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Best answer
This is the expected behavior according to the docs:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
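You can call the two-argument variant of split with limit = -1 to include all trailing empty elements in the result, or perhaps use limit = 8 (or however many columns you expect) if that suits your case better.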
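To make the effect of the limit argument concrete, here is a minimal sketch that counts the columns produced for one of the failing sample lines. The class name SplitLimitDemo and the helper columnCount are hypothetical names used only for illustration; the only library calls are String.split(String, int) and Pattern.quote.

```java
import java.util.regex.Pattern;

public class SplitLimitDemo {

    // Hypothetical helper: how many elements split() returns for a given limit.
    static int columnCount(String line, int limit) {
        return line.split(Pattern.quote("|"), limit).length;
    }

    public static void main(String[] args) {
        // One of the failing lines from the question: three trailing empty columns.
        String line = "geranyl butyrate|106-29-6||224.34|C14H24O2|||";

        // limit = 0 is what the one-argument split uses: trailing empties are dropped.
        System.out.println(columnCount(line, 0));   // prints 5

        // A negative limit keeps every trailing empty string.
        System.out.println(columnCount(line, -1));  // prints 8

        // A positive limit caps the array length; trailing empties are kept.
        System.out.println(columnCount(line, 8));   // prints 8

        // With too many columns, limit = 8 folds the surplus into the last element.
        String tooMany = "a|b|c|d|e|f|g|h|i";
        System.out.println(columnCount(tooMany, 8));   // prints 8, last element is "h|i"
        System.out.println(columnCount(tooMany, -1));  // prints 9
    }
}
```

Note the trade-off in the last two calls: limit = 8 always yields at most 8 elements, so a line with extra delimiters is not detectable by length alone, whereas limit = -1 exposes it.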
Either way, you should check the actual array length afterwards to catch any malformed input.
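A rough sketch of that length check, assuming exactly 8 columns per valid line as stated in the question. ComponentLineParser, parseColumns and toIntOrZero are hypothetical names, not part of the original code; toIntOrZero is a plain-Java stand-in for the question's NumberUtils.toInt call so that the example has no external dependency.

```java
import java.util.Optional;
import java.util.regex.Pattern;

public class ComponentLineParser {

    // Assumption taken from the question: a valid line has exactly 8 columns.
    static final int EXPECTED_COLUMNS = 8;

    // Splits with limit -1 so trailing empty columns survive, then validates the length.
    static Optional<String[]> parseColumns(String line) {
        String[] columns = line.split(Pattern.quote("|"), -1);
        return columns.length == EXPECTED_COLUMNS ? Optional.of(columns) : Optional.empty();
    }

    // Stand-in for NumberUtils.toInt: returns 0 when the text is not an integer.
    static int toIntOrZero(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        String[] lines = {
            "1-chloro-4-nitrobenzene|100-00-5||157.553 |NO2C6H4Cl||400|FID1GC/MSGCMS",
            "geranyl butyrate|106-29-6||224.34|C14H24O2|||",
            "p-Anisic Acid|100-09-4|152.047|152.149|C8H8O3||400|",
            "too|few|columns"  // made-up malformed line, only to exercise the check
        };
        for (String line : lines) {
            Optional<String[]> parsed = parseColumns(line);
            if (!parsed.isPresent()) {
                System.out.println("Invalid line : " + line);
                continue;
            }
            String[] columns = parsed.get();
            String description = columns[0];
            int estimatedShelfLife = toIntOrZero(columns[6]);
            String technique = columns[7];  // safe: length was checked above
            System.out.println(description + " / " + estimatedShelfLife + " / [" + technique + "]");
        }
    }
}
```

With the length verified up front, the field assignments no longer need a try/catch for ArrayIndexOutOfBoundsException, and only the genuinely malformed line is reported.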
About "java - parsing a pipe-delimited string with missing values in between": a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56969429/