java - How do I deep copy a map that has an initial capacity?

Tags: java serialization hashmap deserialization linkedhashmap

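Serializing and then deserializing has become the preferred and accepted way to deep copy objects with complex graphs (see "How do you make a deep copy of an object in Java?" and others), where a copy constructor or factory method is not a good fit.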
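
For reference, the technique can be sketched roughly as follows. The class name `DeepCopy` and its method names are made up for illustration and do not come from the linked question; checked exceptions are wrapped only to keep the call sites short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper: deep copy by writing the object graph to bytes and reading it back.
final class DeepCopy {

    private DeepCopy() {}

    /** Serializes the whole object graph reachable from obj into a byte array. */
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream has been closed (and therefore flushed) at this point.
        return buffer.toByteArray();
    }

    /** Rebuilds an object graph from bytes produced by toBytes. */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a copy that shares no mutable state with the original. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        return (T) fromBytes(toBytes(original));
    }
}
```

Every object in the graph has to implement `Serializable` for this to work; otherwise `writeObject` fails with a `NotSerializableException`, which the sketch surfaces as an `UncheckedIOException`.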

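However, this approach does not work for maps created with an initial capacity. My earlier question about this (Why does specifying Map's initial capacity cause subsequent serializations to give different results?) showed that the resulting objects are not equal, and the answers showed that their byte representations differ: if serialization gives byte[] b1, then deserializing and serializing again gives a byte[] b2 that differs from b1 (in at least one element). This contrasts with the usual behavior of deserialized objects.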
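
A minimal sketch of that b1/b2 comparison is shown here. `RoundTripDemo` and its methods are hypothetical names, and the maps in `main` are empty stand-ins rather than the objects from the earlier question. Whether the second line prints `false` depends on the JDK version in use, so no output is claimed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: does serialize -> deserialize -> serialize reproduce the same bytes?
final class RoundTripDemo {

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if b1 (first serialization) equals b2 (serialization of the deserialized copy). */
    static boolean isByteStable(Object obj) {
        byte[] b1 = serialize(obj);
        byte[] b2 = serialize(deserialize(b1));
        return Arrays.equals(b1, b2);
    }

    public static void main(String[] args) {
        Map<String, Integer> plain = new HashMap<>();
        Map<String, Integer> sized = new HashMap<>(2);
        System.out.println("default capacity, byte-stable:   " + isByteStable(plain));
        System.out.println("initial capacity 2, byte-stable: " + isByteStable(sized));
        // Map.equals compares contents, so this holds regardless of capacity.
        System.out.println("copy equals original:            "
                + deserialize(serialize(sized)).equals(sized));
    }
}
```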

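The readObject and writeObject methods that control the serialization and deserialization process are private and therefore cannot be overridden, which may be intentional (Hashmap slower after deserialization - Why?).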

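I am using the serialization-based deep copy approach on objects that contain many other objects, including maps. I am also comparing and modifying their byte array representations. Everything works as long as the maps are not initialized with an initial capacity argument. However, as described above, trying to optimize the maps by specifying an initial capacity breaks this approach.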

I would like to know whether this problem can be circumvented and, if so, how.

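Best answer

OK, so first of all, you are barking up the wrong tree by focusing on the fact that specifying an initial capacity leads to different serialized bytes. In fact, if you look at the difference: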

pbA from your example:
: ac ed 00 05 73 72 00 0f 71 33 39 31 39 33 34 39   ....sr..q3919349
: 34 2e 53 74 61 74 65 00 00 00 00 00 00 00 01 02   4.State.........
: 00 01 4c 00 04 6d 61 70 73 74 00 10 4c 6a 61 76   ..L..mapst..Ljav
: 61 2f 75 74 69 6c 2f 4c 69 73 74 3b 78 70 73 72   a/util/List;xpsr
: 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61   ..java.util.Arra
: 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01   yListx.....a....
: 49 00 04 73 69 7a 65 78 70 00 00 00 01 77 04 00   I..sizexp....w..
: 00 00 01 73 72 00 14 71 33 39 31 39 33 34 39 34   ...sr..q39193494
: 2e 4d 61 70 57 72 61 70 70 65 72 00 00 00 00 00   .MapWrapper.....
: 00 00 01 02 00 01 4c 00 03 6d 61 70 74 00 0f 4c   ......L..mapt..L
: 6a 61 76 61 2f 75 74 69 6c 2f 4d 61 70 3b 78 70   java/util/Map;xp
: 73 72 00 11 6a 61 76 61 2e 75 74 69 6c 2e 48 61   sr..java.util.Ha
: 73 68 4d 61 70 05 07 da c1 c3 16 60 d1 03 00 02   shMap......`....
: 46 00 0a 6c 6f 61 64 46 61 63 74 6f 72 49 00 09   F..loadFactorI..
: 74 68 72 65 73 68 6f 6c 64 78 70 3f 40 00 00 00   thresholdxp?@...
: 00 00 02 77 08 00 00 00 02 00 00 00 00 78 78      ...w.........xx 

zero from your example:
: ac ed 00 05 73 72 00 0f 71 33 39 31 39 33 34 39   ....sr..q3919349
: 34 2e 53 74 61 74 65 00 00 00 00 00 00 00 01 02   4.State.........
: 00 01 4c 00 04 6d 61 70 73 74 00 10 4c 6a 61 76   ..L..mapst..Ljav
: 61 2f 75 74 69 6c 2f 4c 69 73 74 3b 78 70 73 72   a/util/List;xpsr
: 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61   ..java.util.Arra
: 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01   yListx.....a....
: 49 00 04 73 69 7a 65 78 70 00 00 00 01 77 04 00   I..sizexp....w..
: 00 00 01 73 72 00 14 71 33 39 31 39 33 34 39 34   ...sr..q39193494
: 2e 4d 61 70 57 72 61 70 70 65 72 00 00 00 00 00   .MapWrapper.....
: 00 00 01 02 00 01 4c 00 03 6d 61 70 74 00 0f 4c   ......L..mapt..L
: 6a 61 76 61 2f 75 74 69 6c 2f 4d 61 70 3b 78 70   java/util/Map;xp
: 73 72 00 11 6a 61 76 61 2e 75 74 69 6c 2e 48 61   sr..java.util.Ha
: 73 68 4d 61 70 05 07 da c1 c3 16 60 d1 03 00 02   shMap......`....
: 46 00 0a 6c 6f 61 64 46 61 63 74 6f 72 49 00 09   F..loadFactorI..
: 74 68 72 65 73 68 6f 6c 64 78 70 3f 40 00 00 00   thresholdxp?@...
: 00 00 00 77 08 00 00 00 01 00 00 00 00 78 78      ...w.........xx 

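The only difference is a couple of bytes near the end: the int that follows the load factor (the load factor itself, 3f 40 00 00, i.e. 0.75, is identical in both dumps) and the first int of the 8-byte data block after it, which is where the map records its threshold and capacity. Obviously, these bytes would be different - of course they would if you specify a different initial capacity that was ignored by the first deserialization. This is a red herring.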
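
To make that concrete, here is a small sketch that decodes the tail of each dump (the bytes that follow `thresholdxp`). The class name is invented for illustration. The `loadFactor` and `threshold` labels come from the field names visible in the dump; reading the two ints inside the data block as bucket count and size is an assumption based on how OpenJDK's `HashMap.writeObject` is written.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical decoder for the last HashMap-specific bytes of the two dumps above.
final class HashMapTailDecoder {

    // Bytes copied from the dumps, starting right after "thresholdxp" and stopping before "78 78".
    static final byte[] PBA_TAIL  = hex("3f 40 00 00 00 00 00 02 77 08 00 00 00 02 00 00 00 00");
    static final byte[] ZERO_TAIL = hex("3f 40 00 00 00 00 00 00 77 08 00 00 00 01 00 00 00 00");

    /** Parses a space-separated hex string into bytes. */
    static byte[] hex(String s) {
        String[] parts = s.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    static String decode(byte[] tail) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(tail))) {
            float loadFactor = in.readFloat();   // field "loadFactor" (type F)
            int threshold = in.readInt();        // field "threshold" (type I)
            in.readUnsignedByte();               // 0x77: block data marker
            in.readUnsignedByte();               // 0x08: block length in bytes
            int buckets = in.readInt();          // assumed: bucket count
            int size = in.readInt();             // assumed: number of mappings
            return "loadFactor=" + loadFactor + " threshold=" + threshold
                    + " buckets=" + buckets + " size=" + size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("pbA:  " + decode(PBA_TAIL));
        System.out.println("zero: " + decode(ZERO_TAIL));
    }
}
```

Under that reading, both maps are empty and share the same load factor; only the capacity bookkeeping differs, and none of it is part of the map's contents.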

You are concerned about a corrupt deep copy, but this concern is misplaced. The only thing that matters, in terms of correctness, is the result of the deserialization. It just needs to be a correct, fully functional deep copy that doesn't violate any of your program's invariants. Focusing on the precise serialized bytes is a distraction: you don't care about them; you only care that the result is correct.

Which brings us to the next point:

The only real issue you face here is a difference in long term performance (both speed and memory) characteristics from the fact that some Java versions ignore the initial map capacity when deserializing. This does not affect your data (that is, it will not break invariants), it only potentially affects performance.

So your very first step is to ensure that this is actually a problem. That is, it boils down to a potential premature optimization issue: Ignore the difference in the deserialized map's initial capacity for now. If your application runs with sufficient performance characteristics then you have nothing else to worry about. If it doesn't, and if you are able to narrow the bottlenecks down to decreased deserialized hash map performance due to a different initial capacity, only then should you approach this problem.

And so, the final part of this answer is, if you determine that the performance characteristics of the deserialized map actually are insufficient, there are a number of things you can do.

The simplest, most obvious one I can think of is to implement readResolve() on your object, and take that opportunity to:

  1. Construct a new map with the appropriate parameters (initial capacity, etc.)
  2. Copy all of the items from the old deserialized map to the new one.
  3. Then discard the old map and replace it with your new one.

Example (from your original code example, choosing the map that yielded the "false" result):

import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class MapWrapper implements Serializable {

    private static final long serialVersionUID = 1L;

    Map<String, Integer> map = new HashMap<>(2);

    // Called by the serialization runtime after this object has been read.
    private Object readResolve() throws ObjectStreamException {
        // Replace deserialized 'map' with one that has the desired
        // capacity parameters.
        Map<String, Integer> fixedMap = new HashMap<>(2);
        fixedMap.putAll(map);
        map = fixedMap;
        return this;
    }

}
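
A quick way to check that the hook actually runs is to round-trip a wrapper and inspect the copy. The sketch below repeats the class as a nested type so that it is self-contained; the `rebuilt` flag, the `roundTrip` helper and the outer class name are hypothetical additions for the demonstration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo around the MapWrapper idea above.
final class ReadResolveDemo {

    static class MapWrapper implements Serializable {
        private static final long serialVersionUID = 1L;

        Map<String, Integer> map = new HashMap<>(2);

        // Demo-only marker: transient, so it is false after reading unless readResolve sets it.
        transient boolean rebuilt;

        private Object readResolve() throws ObjectStreamException {
            Map<String, Integer> fixedMap = new HashMap<>(2);
            fixedMap.putAll(map);
            map = fixedMap;
            rebuilt = true;
            return this;
        }
    }

    /** Serializes the wrapper to memory and reads it back. */
    static MapWrapper roundTrip(MapWrapper original) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (MapWrapper) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MapWrapper original = new MapWrapper();
        original.map.put("a", 1);
        MapWrapper copy = roundTrip(original);
        System.out.println("readResolve ran:  " + copy.rebuilt);
        System.out.println("contents equal:   " + copy.map.equals(original.map));
        System.out.println("distinct maps:    " + (copy.map != original.map));
    }
}
```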

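But first, ask whether this is really causing you a problem. I believe you are overthinking it, and focusing on byte-for-byte comparisons of the serialized data is not productive for you.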

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/39193494/