Java: compressing a byte array and Base64-encoding it, then Base64-decoding and decompressing it, gives an error: different sized input/output arrays

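My application needs a list of doubles encoded as a byte array that is little-endian, zlib-compressed, and then Base64-encoded. I wrote a harness to test my encoding, but it wasn't working. I was able to make some progress.

However, I noticed that when I decompress into a fixed-size buffer, I can come up with input for which the decompressed byte array is smaller than the original byte array, which is obviously wrong. At the same time, the last double in the list disappears. On most inputs, the fixed buffer size reproduces the input. Does anyone know why this happens? I'm guessing the error is in the way I encode the data, but I can't figure out what is going wrong.

When I try using a ByteArrayOutputStream to handle variable-length output of arbitrary size (which matters for the real version of the code, because I can't guarantee a maximum size), the Inflater's inflate method keeps returning 0. I looked up the documentation, and it says this means it needs more data. Since there is no more data, I again suspect my encoding, and I guess it is the same problem that causes the behavior described above.

In my code I've included one example of data that works with the fixed buffer size, and one that doesn't. Both datasets cause the variable-buffer error I described.

Any clues as to what I'm doing wrong? Many thanks.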

import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
import org.apache.commons.codec.binary.Base64;

public class BinaryReaderWriter {
    public static void main(String [ ] args) throws UnsupportedEncodingException, DataFormatException
{
    // this input will break the fixed buffer method
    //double[] centroids = {123.1212234143345453223123123, 28464632322456781.23, 3123121.0};

    // this input will not break the fixed buffer method
    double[] centroids = {123.1212234143345453223123123, 28464632322456781.23, 31.0};
    BinaryReaderWriter brw = new BinaryReaderWriter();
    String output = brw.compressCentroids(centroids);
    brw.decompressCentroids(output);
}
void decompressCentroids(String encoded) throws DataFormatException{
    byte[] binArray = Base64.decodeBase64(encoded);


    // This block of code is the fixed buffer version
    //
System.out.println("binArray length " + binArray.length);
    Inflater deCompressor = new Inflater();
    deCompressor.setInput(binArray, 0, binArray.length);
    byte[] decompressed = new byte[1024];
    int decompressedLength = deCompressor.inflate(decompressed);
    deCompressor.end();
System.out.println("decompressedLength = " + decompressedLength);
    byte[] decompressedData = new byte[decompressedLength];
    for(int i=0;i<decompressedLength;i++){
        decompressedData[i] = decompressed[i];
    }


    /*
    // This block of code is the variable buffer version
    //
    ByteArrayOutputStream bos = new ByteArrayOutputStream(binArray.length);
    Inflater deCompressor = new Inflater();
    deCompressor.setInput(binArray, 0, binArray.length);
    byte[] decompressed = new byte[1024];
    while (!deCompressor.finished()) {
        int decompressedLength = deCompressor.inflate(decompressed);
        bos.write(decompressed, 0, decompressedLength);
    }
    deCompressor.end();
    byte[] decompressedData = bos.toByteArray();
    */

    ByteBuffer bb = ByteBuffer.wrap(decompressedData);
    bb.order(ByteOrder.LITTLE_ENDIAN);
System.out.println("decompressedData length = " + decompressedData.length);
    double[] doubleValues = new double[decompressedData.length / 8];
    for (int i = 0; i< doubleValues.length; i++){
        doubleValues[i] = bb.getDouble(i * 8);
    }

    for(double dbl : doubleValues){
        System.out.println(dbl);
    }   
}

String compressCentroids(double[] centroids){
    byte[] cinput = new byte[centroids.length * 8];
    ByteBuffer buf = ByteBuffer.wrap(cinput);
    buf.order(ByteOrder.LITTLE_ENDIAN);
    for (double cent : centroids){
        buf.putDouble(cent);
    }

    byte[] input = buf.array();
System.out.println("raw length = " + input.length);
    byte[] output = new byte[input.length];
    Deflater compresser = new Deflater();
    compresser.setInput(input);
    compresser.finish();
    int compressedLength = compresser.deflate(output);
    compresser.end();
System.out.println("Compressed length = " + compressedLength);
    byte[] compressed = new byte[compressedLength];
    for(int i = 0; i < compressedLength; i++){
        compressed[i] = output[i];
    }

    String decrypted = Base64.encodeBase64String(compressed);
    return decrypted;
}

}

Best answer

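When we compress data, what we are really doing is re-encoding it to increase its entropy, so that each byte of output carries more information. During re-encoding we have to add metadata that records how the data was encoded, so that it can be converted back to its previous form.

Compression only succeeds if the metadata is smaller than the space saved by re-encoding the data.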
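
The size trade-off above can be checked directly. The following is a minimal sketch, not the asker's code: `CompressedSizeDemo` and its helper names are made up for illustration. It drains a `Deflater` in small chunks so that the full compressed length is counted, and compares the question's 24 bytes of doubles with 1024 zero bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.Deflater;

public class CompressedSizeDemo {
    // Packs doubles as little-endian bytes, as the question's code does.
    public static byte[] toLittleEndianBytes(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    // Returns the full length of the zlib stream, draining the deflater in chunks.
    public static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] chunk = new byte[64];
        int total = 0;
        // Keep calling deflate() until the whole stream has been produced.
        while (!deflater.finished()) {
            total += deflater.deflate(chunk);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] raw = toLittleEndianBytes(
                new double[] {123.1212234143345453223123123, 28464632322456781.23, 31.0});
        byte[] zeros = new byte[1024]; // highly repetitive input
        System.out.println("doubles: raw=" + raw.length
                + " compressed=" + compressedSize(raw));
        System.out.println("zeros:   raw=" + zeros.length
                + " compressed=" + compressedSize(zeros));
    }
}
```

The doubles should come out larger than 24 bytes once the zlib header and checksum are included, while the zero-filled array shrinks to a small fraction of its size.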

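Consider Huffman encoding:

Huffman is a simple encoding scheme in which we replace a fixed-width character set with a variable-width character set plus a table of code lengths. For obvious reasons, the size of the length table is greater than 0. If all characters occur with a nearly equal distribution, we cannot save any space, so the compressed data ends up larger than the uncompressed data. That is the case in the question: 24 bytes of doubles barely compress, and the zlib header and checksum add to the total, so the `output` buffer of `input.length` bytes in compressCentroids is too small. The single deflate call returns only the part that fits, and the Inflater is later given a truncated stream.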
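
A possible fix is to stop assuming the compressed data fits in `input.length` bytes and to drain both the Deflater and the Inflater in a loop. The sketch below makes a few assumptions that are not in the original post: it uses `java.util.Base64` (Java 8+) in place of the Apache Commons Codec class from the question, `CentroidCodec` is an illustrative name, and rethrowing `DataFormatException` as `IllegalArgumentException` is just one way to report bad input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CentroidCodec {
    public static String compressCentroids(double[] centroids) {
        ByteBuffer buf = ByteBuffer.allocate(centroids.length * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (double c : centroids) {
            buf.putDouble(c);
        }
        Deflater deflater = new Deflater();
        deflater.setInput(buf.array());
        deflater.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        // Loop until the deflater has emitted the whole stream,
        // however large it is relative to the input.
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            bos.write(chunk, 0, n);
        }
        deflater.end();
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static double[] decompressCentroids(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                // A 0 return with no input left means the stream is truncated;
                // fail instead of spinning forever.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported zlib stream");
                }
                bos.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt zlib stream", e);
        } finally {
            inflater.end();
        }
        ByteBuffer bb = ByteBuffer.wrap(bos.toByteArray()).order(ByteOrder.LITTLE_ENDIAN);
        double[] values = new double[bb.remaining() / Double.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = bb.getDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        double[] centroids = {123.1212234143345453223123123, 28464632322456781.23, 31.0};
        double[] roundTripped = decompressCentroids(compressCentroids(centroids));
        System.out.println(Arrays.equals(centroids, roundTripped));
    }
}
```

With the deflate loop in place, the variable-buffer inflate loop from the question should terminate as well, because the Inflater now receives a complete stream including its end marker and checksum.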

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/20622837/
