I'm having a lot of trouble parsing a simple XML document encoded as UTF-16BE. The XML looks correct and is very simple:
<?xml version="1.0"?>
<ACTCDOC xmlns="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd">
<BCARQ>
<NomArq>ACTC101_00360305_20140508_00010_PRO</NomArq>
<NumCtrlEmis>20140508000000000715</NumCtrlEmis>
<NumCtrlDestOr>10</NumCtrlDestOr>
<ISPBEmissor>02992335</ISPBEmissor>
<ISPBDestinatario>00360305</ISPBDestinatario>
<DtHrArq>2014-05-08T21:31:10</DtHrArq>
<DtRef>2014-05-08</DtRef>
</BCARQ>
</ACTCDOC>
I'm trying to parse it with the following code:
@SuppressWarnings("unchecked")
public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
    JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
    T ret = (T) jaxbUnmarshaller.unmarshal(in);
    return ret;
}
My domain classes look like this:
@XmlRootElement(name="ACTCDOC", namespace="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "ACTCDOCPROComplexType", propOrder = {
    "bcarq"
})
public class ACTCDOCPROComplexType {
    @XmlElement(name = "BCARQ", required = true)
    protected BCARQComplexType bcarq;
    // ... getters and setters
}
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "BCARQComplexType", propOrder = {
    "nomArq",
    "numCtrlEmis",
    "numCtrlDestOr",
    "ispbEmissor",
    "ispbDestinatario",
    "dtHrArq",
    "sitReq",
    "grupoSeq",
    "dtRef"
})
public class BCARQComplexType {
    @XmlElement(name = "NomArq", required = true)
    protected String nomArq;
    @XmlElement(name = "NumCtrlEmis", required = true)
    protected String numCtrlEmis;
    @XmlElement(name = "NumCtrlDestOr")
    protected String numCtrlDestOr;
    @XmlElement(name = "ISPBEmissor", required = true)
    protected String ispbEmissor;
    @XmlElement(name = "ISPBDestinatario", required = true)
    protected String ispbDestinatario;
    @XmlElement(name = "DtHrArq", required = true)
    @XmlJavaTypeAdapter(DataHoraAdaptador.class)
    protected XMLGregorianCalendar dtHrArq;
    @XmlElement(name = "SitReq")
    protected BigInteger sitReq;
    @XmlElement(name = "Grupo_Seq")
    protected GrupoSeqComplexType grupoSeq;
    @XmlElement(name = "DtRef", required = true)
    @XmlJavaTypeAdapter(DataAdaptador.class)
    protected XMLGregorianCalendar dtRef;
    // ... getters and setters
}
When I unmarshal the InputStream and print the object, the BCARQ element is null, as shown below:
ACTCDOCPROComplexType doc = XMLUtil.lerXML(ACTCDOCPROComplexType.class, is);
System.out.println(doc.getBCARQ());
Does JAXB work properly with UTF-16BE? I also tried another approach: wrapping the original InputStream in a Reader and converting from UTF-16BE to UTF-8, but without success. Here is the code:
public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
    JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
    try {
        StringBuilder buf = new StringBuilder();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String linha;
        while ((linha = reader.readLine()) != null) {
            buf.append(linha+"\n");
        }
        CharsetDecoder decoder = Charset.forName("UTF-16BE").newDecoder();
        ByteBuffer bytes = ByteBuffer.wrap(buf.toString().getBytes());
        String xmlUTF8 = decoder.decode(bytes).toString();
        ByteArrayInputStream bis = new ByteArrayInputStream(xmlUTF8.getBytes());
        ret = (T) jaxbUnmarshaller.unmarshal(in);
        return ret;
    } catch (IOException e) {
        throw new JAXBException(e);
    }
}
But with this version I get the following error:
[org.xml.sax.SAXParseException: Premature end of file.]
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
at br.gov.caixa.sigec.util.XMLUtil.lerXML(XMLUtil.java:124)
at br.gov.caixa.sigec.negocio.preprocessador.PreProcessadorACTC101PRO.main(PreProcessadorACTC101PRO.java:99)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
... 6 more
Any ideas? Thanks.
Best answer
Your XML should declare the encoding in its header:
<?xml version="1.0" encoding="UTF-16BE"?>
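As a quick check of the declaration approach, the sketch below encodes a trimmed copy of the question's document as UTF-16BE with the declaration in place and parses it directly from the raw bytes. It is only an illustration under some assumptions: it uses the JDK's built-in DOM parser instead of JAXB so that it runs without extra dependencies, and the class and method names (EncodingDeclarationDemo, sample, rootLocalName) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingDeclarationDemo {
    // A trimmed copy of the question's document, with the encoding declared.
    static byte[] sample() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16BE\"?>\n"
                + "<ACTCDOC xmlns=\"http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd\">\n"
                + "<BCARQ><NumCtrlDestOr>10</NumCtrlDestOr></BCARQ>\n"
                + "</ACTCDOC>";
        // UTF_16BE writes big-endian code units and no byte order mark.
        return xml.getBytes(StandardCharsets.UTF_16BE);
    }

    // Parses raw bytes, so the parser has to determine the encoding itself.
    static String rootLocalName(byte[] xmlBytes) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getLocalName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootLocalName(sample()));
    }
}
```

The same bytes could be handed to `Unmarshaller.unmarshal(InputStream)` in a project that has JAXB on the classpath.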
If for some reason you cannot include the encoding in the XML header, you can use a Reader that specifies the charset instead. Try the following:
InputStream inputStream = new FileInputStream("input.xml");
Reader reader = new InputStreamReader(inputStream, "UTF-16BE");
Object result = unmarshaller.unmarshal(reader);
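To see the Reader approach in isolation, here is a small self-contained sketch. It is not the asker's code: it uses the JDK's DOM parser in place of JAXB so it runs without extra dependencies, and the names Utf16beReaderDemo and readNomArq are made up for illustration. The point it shows is that the bytes are decoded as UTF-16BE before the parser sees them, so no encoding declaration is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf16beReaderDemo {
    // Decodes the stream as UTF-16BE up front and gives the parser characters, not bytes.
    static String readNomArq(InputStream in) throws Exception {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_16BE)) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(reader));
            // "*" matches any namespace, so the default namespace on ACTCDOC is fine.
            return doc.getElementsByTagNameNS("*", "NomArq").item(0).getTextContent();
        }
    }

    public static void main(String[] args) throws Exception {
        // No encoding in the declaration, as in the question's document.
        String xml = "<?xml version=\"1.0\"?>"
                + "<ACTCDOC xmlns=\"http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd\">"
                + "<BCARQ><NomArq>ACTC101_00360305_20140508_00010_PRO</NomArq></BCARQ>"
                + "</ACTCDOC>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(readNomArq(new ByteArrayInputStream(bytes)));
    }
}
```

With JAXB, the equivalent step is passing the same kind of `InputStreamReader` to `Unmarshaller.unmarshal(Reader)`. Note also that the second attempt in the question builds a new stream `bis` but still passes the already-consumed `in` to `unmarshal`, which is consistent with the "Premature end of file" error.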
Alternatively, try parsing the XML with a StAX XMLStreamReader and then have the Unmarshaller unmarshal it.
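A minimal sketch of the StAX variant follows, again using only JDK classes. It assumes the reader is created from a Reader that already decodes UTF-16BE (one of several ways to construct an XMLStreamReader), and the names Utf16beStaxDemo and findElementText are hypothetical. Instead of unmarshalling, it just walks the events and returns the text of one element, to show that the stream is read correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class Utf16beStaxDemo {
    // Returns the text of the first element with the given local name, or null if absent.
    static String findElementText(InputStream in, String localName) throws Exception {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_16BE);
        XMLStreamReader xsr = XMLInputFactory.newFactory().createXMLStreamReader(reader);
        try {
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(xsr.getLocalName())) {
                    return xsr.getElementText();
                }
            }
            return null;
        } finally {
            xsr.close();   // closes the StAX reader, not the underlying stream
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<ACTCDOC xmlns=\"http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd\">"
                + "<BCARQ><DtRef>2014-05-08</DtRef></BCARQ>"
                + "</ACTCDOC>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(findElementText(new ByteArrayInputStream(bytes), "DtRef"));
    }
}
```

In a JAXB project, an `XMLStreamReader` built this way can be passed to `Unmarshaller.unmarshal(XMLStreamReader)` instead of being iterated by hand.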
Regarding java - JAXB with UTF-16BE: null element and premature end of file, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23612048/