java - JAXB with UTF-16BE: null element and "Premature end of file"

Tags: java xml encoding utf-8 jaxb

I am having a lot of trouble parsing a simple XML document encoded as UTF-16BE. The XML looks correct and is very simple:

<?xml version="1.0"?>
<ACTCDOC xmlns="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd">
    <BCARQ>
        <NomArq>ACTC101_00360305_20140508_00010_PRO</NomArq>
        <NumCtrlEmis>20140508000000000715</NumCtrlEmis>
        <NumCtrlDestOr>10</NumCtrlDestOr>
        <ISPBEmissor>02992335</ISPBEmissor>
        <ISPBDestinatario>00360305</ISPBDestinatario>
        <DtHrArq>2014-05-08T21:31:10</DtHrArq>
        <DtRef>2014-05-08</DtRef>
    </BCARQ>
</ACTCDOC>

I am trying to parse it with the following code:

@SuppressWarnings("unchecked")
public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
    JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();

    T ret = (T) jaxbUnmarshaller.unmarshal(in);

    return ret;
}

My domain classes look like this:

@XmlRootElement(name="ACTCDOC", namespace="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "ACTCDOCPROComplexType", propOrder = {
    "bcarq"
})
public class ACTCDOCPROComplexType {

    @XmlElement(name = "BCARQ", required = true)
    protected BCARQComplexType bcarq;

    // ... getters and setters
}

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "BCARQComplexType", propOrder = {
    "nomArq",
    "numCtrlEmis",
    "numCtrlDestOr",
    "ispbEmissor",
    "ispbDestinatario",
    "dtHrArq",
    "sitReq",
    "grupoSeq",
    "dtRef"
})
public class BCARQComplexType {

    @XmlElement(name = "NomArq", required = true)
    protected String nomArq;
    @XmlElement(name = "NumCtrlEmis", required = true)
    protected String numCtrlEmis;
    @XmlElement(name = "NumCtrlDestOr")
    protected String numCtrlDestOr;
    @XmlElement(name = "ISPBEmissor", required = true)
    protected String ispbEmissor;
    @XmlElement(name = "ISPBDestinatario", required = true)
    protected String ispbDestinatario;
    @XmlElement(name = "DtHrArq", required = true)
    @XmlJavaTypeAdapter(DataHoraAdaptador.class)
    protected XMLGregorianCalendar dtHrArq;
    @XmlElement(name = "SitReq")
    protected BigInteger sitReq;
    @XmlElement(name = "Grupo_Seq")
    protected GrupoSeqComplexType grupoSeq;
    @XmlElement(name = "DtRef", required = true)
    @XmlJavaTypeAdapter(DataAdaptador.class)
    protected XMLGregorianCalendar dtRef;

    // ... getters and setters
}

When I parse the InputStream and print the object, the BCARQ element is null, as shown below:

ACTCDOCPROComplexType doc = XMLUtil.lerXML(ACTCDOCPROComplexType.class, is);

System.out.println(doc.getBCARQ());

Does JAXB work properly with UTF-16BE? I also tried another approach: wrapping the original InputStream in a Reader and converting from UTF-16BE to UTF-8, but without success. The code is below:

public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
    JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();

    try {
        StringBuilder buf = new StringBuilder();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String linha;
        while ((linha = reader.readLine()) != null) {
            buf.append(linha+"\n");
        }

        CharsetDecoder decoder = Charset.forName("UTF-16BE").newDecoder();
        ByteBuffer bytes = ByteBuffer.wrap(buf.toString().getBytes());
        String xmlUTF8 = decoder.decode(bytes).toString();

        ByteArrayInputStream bis = new ByteArrayInputStream(xmlUTF8.getBytes());
        T ret = (T) jaxbUnmarshaller.unmarshal(in);

        return ret;
    } catch (IOException e) {
        throw new JAXBException(e);
    }
}

But with this version I get the following error:

[org.xml.sax.SAXParseException: Premature end of file.]
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at br.gov.caixa.sigec.util.XMLUtil.lerXML(XMLUtil.java:124)
    at br.gov.caixa.sigec.negocio.preprocessador.PreProcessadorACTC101PRO.main(PreProcessadorACTC101PRO.java:99)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    ... 6 more

Any ideas? Thanks.

Best answer

Your XML should declare its encoding in the XML header:

<?xml version="1.0" encoding="UTF-16BE"?>

If for some reason you cannot include the encoding in the XML header, you can use a Reader with an explicit charset instead:

InputStream inputStream = new FileInputStream("input.xml");
Reader reader = new InputStreamReader(inputStream, "UTF-16BE");
Object result = unmarshaller.unmarshal(reader);
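
To see why the explicit charset matters, here is a minimal sketch that does not need JAXB at all. The class CharsetReaderSketch and the helper readAll are hypothetical names, not part of the question's code. It decodes the same UTF-16BE bytes twice: once with UTF-16BE and once with UTF-8, which is roughly what new InputStreamReader(in) does when the platform default charset happens to be UTF-8 (an assumption about the environment). With the wrong charset every other character comes out as NUL, so the text is no longer well-formed XML.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetReaderSketch {

    // Hypothetical helper: reads every character from the bytes using the given charset.
    public static String readAll(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "<DtRef>2014-05-08</DtRef>";
        // UTF_16BE encodes each of these characters as two bytes and writes no BOM.
        byte[] bytes = original.getBytes(StandardCharsets.UTF_16BE);

        String right = readAll(bytes, StandardCharsets.UTF_16BE);
        // Assumption: stands in for a Reader created without an explicit charset.
        String wrong = readAll(bytes, StandardCharsets.UTF_8);

        System.out.println("decoded correctly: " + right.equals(original));
        System.out.println("wrong decode contains NUL chars: " + (wrong.indexOf(0) >= 0));
    }
}
```

The same reasoning applies to the second attempt in the question: reading the stream line by line without naming the charset has already damaged the text before any later conversion runs.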

Alternatively, try parsing the XML with a StAX XMLStreamReader and then have the Unmarshaller unmarshal from it.
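
A rough sketch of the StAX route, using only the JDK's javax.xml.stream API so that it runs without JAXB on the classpath. The class Utf16StaxSketch and the helper firstElementText are made-up names for illustration. The XMLStreamReader is built over a Reader with an explicit UTF-16BE charset, so the parser never has to guess the encoding. In real code the same XMLStreamReader could be passed to Unmarshaller.unmarshal(XMLStreamReader) instead of being walked by hand as it is here.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class Utf16StaxSketch {

    // Hypothetical helper: returns the text of the first element with the
    // given local name, or null if no such element is found.
    public static String firstElementText(byte[] xmlBytes, Charset charset, String localName)
            throws XMLStreamException {
        // The Reader decodes the bytes; the parser only ever sees characters.
        Reader reader = new InputStreamReader(new ByteArrayInputStream(xmlBytes), charset);
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(reader);
        try {
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(xsr.getLocalName())) {
                    return xsr.getElementText();
                }
            }
            return null;
        } finally {
            xsr.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // A shortened version of the question's document, with no encoding declaration.
        String xml = "<?xml version=\"1.0\"?>"
                + "<ACTCDOC xmlns=\"http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd\">"
                + "<BCARQ><NumCtrlDestOr>10</NumCtrlDestOr></BCARQ>"
                + "</ACTCDOC>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16BE);

        System.out.println(firstElementText(bytes, StandardCharsets.UTF_16BE, "NumCtrlDestOr"));
    }
}
```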

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/23612048/
