I have a multi-line element in my XML:
<tag id="sometag" ...>
| first line
| second line
| third line
| fourth line
<tag ...>
....
<tag id="someothertag" ...>
| ANOTHER FIRST LINE
| ANOTHER SECOND LINE
| ANOTHER THIRD LINE
| ANOTHER FORTH LINE
<tag ...>
Then in Java, I have the necessary startElement, endElement, and characters methods, but I am seeing strange behavior from characters:
public void characters(char[] ch, int start, int length){
    Log.d(TAG, "characters( \"" + (new String(ch)).replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + " )");
}
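Apart from that, I do nothing with the characters. I am basically creating two instances of a parser. In one instance, I search for sometag. When I find what I am looking for, I throw an exception (to stop parsing) and return that element.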
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "sometag"!
...For another, brand-new instance, I search for someothertag. I do the same thing as before.
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "someothertag"!
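I know XML parsing is stream-based (it parses in blocks rather than the whole string), but this is very strange behavior. Here are some of the very confusing things I noticed:
- On each call to characters(), the parser does not start or end the characters where it left off, if it really has finished parsing them: I even get characters before the first character array ('n', which is the replacement for the newline).
- ch has extra characters that were not there originally: "line" appended to "fourth line".
- When I create a brand-new parser instance, the characters are "re-read". The second run should look like this: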
...this...
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | ANOTHER FIRST LINE", 0, 20 )
D/MyProgram( 1565): characters( "n | ANOTHER SECOND LINE", 0, 1 )
...and so on.
Any idea what I am doing wrong? Thanks in advance.
Best answer
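As Margulies said, you are not using start and length with the character array you are passed. Only the segment from ch[start] to ch[start + length - 1] belongs to the current call; the rest of the array may hold leftovers from earlier calls.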
public void characters(char[] ch, int start, int length) {
// use only the indicated segment.
String str = new String( ch, start, length);
    Log.d(TAG, "characters( \"" + str.replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + " )");
}
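Going one step beyond logging, here is a minimal sketch of how the text of one element could be collected. It assumes elements named tag with an id attribute, as in the question. The names TagTextFinder, find, and done are hypothetical and chosen only for illustration, and nested tag elements are not handled. Because a SAX parser may deliver one element's text in several characters() calls, the sketch appends each indicated segment to a StringBuilder, then stops parsing by throwing a SAXException once the wanted element has closed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of the <tag> element with a given id.
class TagTextFinder extends DefaultHandler {
    private final String wantedId;
    private final StringBuilder text = new StringBuilder();
    private boolean inside = false; // true while we are within the wanted element
    private boolean done = false;   // true once the wanted element has been closed

    TagTextFinder(String wantedId) {
        this.wantedId = wantedId;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("tag".equals(qName) && wantedId.equals(atts.getValue("id"))) {
            inside = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append only the indicated segment; the rest of ch may be stale buffer content.
        if (inside) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (inside && "tag".equals(qName)) {
            done = true;
            // Throwing from a callback aborts the parse early.
            throw new SAXException("stop: element found");
        }
    }

    // Returns the collected text, or an empty string if no element has that id.
    static String find(String xml, String id) throws Exception {
        TagTextFinder handler = new TagTextFinder(id);
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Swallow only our own "stop" signal; rethrow real parse errors.
            if (!handler.done) {
                throw e;
            }
        }
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><tag id=\"sometag\">| first line\n| second line</tag>"
                   + "<tag id=\"someothertag\">| ANOTHER FIRST LINE</tag></root>";
        System.out.println(find(xml, "someothertag")); // prints "| ANOTHER FIRST LINE"
    }
}
```

Each call to find creates a fresh handler and parser, so a second search starts from a clean StringBuilder rather than from text left over by the first one.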
About java - Strange characters() behavior with SAX + Java: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19237203/