java - Will String.getBytes("UTF-16") return the same result on all platforms?

Tags: java string encoding

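I need to create a hash from a String containing a user's password. To create the hash, I use a byte array obtained by calling String.getBytes(). However, when I call this method with a specified encoding (such as UTF-8) on a platform where that is not the default encoding, the non-ASCII characters get replaced by a default character (if I understand the behaviour of getBytes() correctly), so on such a platform I will get a different byte array and, eventually, a different hash.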

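Since Strings are stored internally as UTF-16, will calling String.getBytes("UTF-16") guarantee that I get the same byte array on every platform, regardless of its default encoding?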

Best answer

Yes. Not only is it guaranteed to be UTF-16, but the byte order is defined too:

When decoding, the UTF-16 charset interprets the byte-order mark at the beginning of the input stream to indicate the byte-order of the stream but defaults to big-endian if there is no byte-order mark; when encoding, it uses big-endian byte order and writes a big-endian byte-order mark.

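(Note that the BOM is part of that output: as the quoted documentation says, the "UTF-16" encoder writes a big-endian byte-order mark, so the array returned by String.getBytes("UTF-16") starts with the bytes FE FF. The result is still the same on every platform. If you do not want the BOM, ask for "UTF-16BE" or "UTF-16LE" instead, which do not write one.)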

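As long as you have the same string content, i.e. the same sequence of char values, you will get the same bytes on every implementation of Java, barring bugs. (Any such bug would be pretty surprising, given that UTF-16 is probably the simplest encoding to implement in Java...)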
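
A minimal sketch of what the encoders actually produce, assuming a one-character test string (U+00E9) chosen here purely for illustration. The class name `Utf16BytesDemo` and the `hex` helper are hypothetical. The `StandardCharsets` constants are used instead of charset names so that no checked exception is involved and nothing depends on the platform default.

```java
import java.nio.charset.StandardCharsets;

class Utf16BytesDemo {
    // Formats a byte array as space-separated, two-digit uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u00E9"; // a single non-ASCII char (illustrative input)

        // "UTF-16" encodes big-endian and prepends a big-endian BOM (FE FF).
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16)));   // FE FF 00 E9

        // The explicit-endianness variants write no BOM.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE))); // 00 E9
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16LE))); // E9 00
    }
}
```

None of these calls consult the platform default charset, so the printed bytes should be identical on every conforming Java implementation.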

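The fact that UTF-16 is the native representation of char (and usually of String) only matters for ease of implementation, however. For instance, I would also expect String.getBytes("UTF-8") to give the same result on every platform.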
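
Applied to the hashing use case from the question, this could look roughly like the sketch below. The question does not say which hash function is used, so SHA-256 via `MessageDigest` is an assumption made for illustration only, and the names `PasswordDigestDemo`, `sha256` and `hex` are hypothetical. The point being shown is only that the charset is passed explicitly, so the platform default never affects the bytes being hashed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class PasswordDigestDemo {
    // Hashes the text after encoding it with an explicit charset (UTF-8).
    // SHA-256 is an assumed choice; the original question names no algorithm.
    static byte[] sha256(String text) {
        try {
            byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
            return MessageDigest.getInstance("SHA-256").digest(encoded);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Formats a byte array as lowercase hex without separators.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative password containing non-ASCII characters.
        String password = "p\u00E4ssw\u00F6rd";
        System.out.println(hex(sha256(password)));
    }
}
```

Any fixed charset works as long as the same one is used whenever the hash is computed. UTF-8 and UTF-16BE can both encode every char sequence without a BOM, whereas the parameterless `getBytes()` depends on the default charset.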

For more on java - Will String.getBytes("UTF-16") return the same result on all platforms?, see the similar question on Stack Overflow: https://stackoverflow.com/questions/25876632/
