json - 为什么 JSON 编码 UTF-16 代理对而不是直接编码 Unicode 代码点？

标签 json unicode specifications ecma

To escape a code point that is not in the Basic Multilingual Plane, the character is represented as a twelve-character sequence, encoding the UTF-16 surrogate pair. So for example, a string containing only the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".

ECMA-404:The JSON Data Interchange Format

我相信there is no need to encode this character at all ，所以可以直接表示为 "𝄞" .但是，如果希望对其进行编码，则必须按照规范将其编码为 "\uD834\uDD1E" , 不像 "\u1d11e" 那样(看起来很合理) .为什么是这样？

最佳答案

JSON 的关键架构特性之一是 JSON 编码的对象是有效的 Javascript 文字，可以使用 eval 进行评估。功能，例如。不幸的是，较旧的 Javascript 实现仅支持 16 位 Unicode 转义序列，在字符串文字中包含四个十六进制字符，因此除了以可移植的方式在转义序列中使用 UTF-16 代理来处理高于 0xFFFF 的代码点外，别无他法。 (允许任意代码点的 \u{...} 语法仅在 ECMAScript 6 中引入。)

但正如您所提到的，如果您的应用程序支持 Unicode JSON 文本，则无需使用转义序列。只需直接以相应的 Unicode 格式对字符进行编码。

关于json - 为什么 JSON 编码 UTF-16 代理对而不是直接编码 Unicode 代码点？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/38463038/

上一篇：validation - 在 Google 电子表格中复制/粘贴数据验证

下一篇：google-drive-api - 如何使用 google drive api for python？

相关文章：

javascript - 使用 Lodash 操作 JSON

python - 将unicode字符串拆分为单词

python - 如何在 Python 中将 Unicode 文件读取为 Unicode 字符串

javascript - 阅读 elem.style——规范保证结果格式吗？

java - GSON - 如何解析两个名称相同但参数不同的 JSONArray？

javascript - 如何检查 JSON Array 对象是否包含 Key

css - OL-LI 列出 Unicode 中的数字

java - 谁能简明扼要地描述 3 个闭包提案与它们在 Java 中的当前状态之间的区别？

python - distutils setup.py 和 %post %postun

ios - 在 SWIFT 4 中连接成功后找不到 JSON 解析数据