unicode - Upgrading MariaDB collation from utf8mb4_unicode_520_ci

Tags: unicode mariadb

Currently I use:

  • utf8mb4 as the database character set.
  • utf8mb4_unicode_520_ci as the database collation.

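As I understand it, utf8mb4 supports up to four bytes per character. I also understand that Unicode is a standard that keeps being updated. In the past I thought utf8 was enough, until some of my test data got corrupted and I learned my lesson. However, I'm having trouble understanding the upgrade path for character sets and collations.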

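The utf8mb4_unicode_520_ci database collation is based on Unicode Collation Algorithm version 5.2.0. If you navigate to the parent directory, you will see that the highest version listed at the time of writing is 14.0. Those are the Unicode standards; separately, there are the character sets and collations that MariaDB actually supports.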

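Offhand, I'm not sure when there will be a need to move from four bytes per character to 8 or even 16 bytes per character, so this may not be as simple as just updating the database collation. In addition, I don't see anything in MariaDB's documentation that appears to be newer than version 5.2.0.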

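In short, my three closely related questions are:

  • Are newer collations such as version 14 still fully compatible with four-byte characters, or have they exhausted all combinations so that up to 8 or 16 bytes per character are now needed?
  • What is the newest Unicode version for which MariaDB supports a database collation?
  • Following on from the second question: if MariaDB supports a version newer than 5.2.0, is utf8mb4 still sufficient as the character set?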

I am not bound by, and do not care about, MySQL compatibility.

Best answer

MariaDB skipped ahead from the old 5.2.0 version and added UCA 14.0.0 collations in MariaDB 10.10.2; they are also available in MariaDB 10.11 and later.

The 14.0.0 collations also include accent insensitivity as an optional collation attribute.

This version also supports contractions.

The list is:

MariaDB [(none)]> SELECT * FROM INFORMATION_SCHEMA.COLLATIONS where collation_name like 'uca1400_%';
+--------------------------------+--------------------+------+------------+-------------+---------+
| COLLATION_NAME                 | CHARACTER_SET_NAME | ID   | IS_DEFAULT | IS_COMPILED | SORTLEN |
+--------------------------------+--------------------+------+------------+-------------+---------+
| uca1400_ai_ci                  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_ai_cs                  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_as_ci                  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_as_cs                  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_nopad_ai_ci            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_nopad_ai_cs            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_nopad_as_ci            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_nopad_as_cs            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_ai_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_ai_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_as_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_as_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_nopad_ai_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_nopad_ai_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_nopad_as_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_icelandic_nopad_as_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_latvian_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_ai_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_ai_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_as_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_as_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_nopad_ai_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_nopad_ai_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_nopad_as_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_romanian_nopad_as_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_ai_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_ai_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_as_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_as_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_nopad_ai_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_nopad_ai_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_nopad_as_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovenian_nopad_as_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_ai_ci           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_ai_cs           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_as_ci           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_as_cs           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_nopad_ai_ci     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_nopad_ai_cs     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_nopad_as_ci     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_polish_nopad_as_cs     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_ai_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_ai_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_as_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_as_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_nopad_ai_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_nopad_ai_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_nopad_as_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_estonian_nopad_as_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_swedish_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_turkish_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_ai_ci            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_ai_cs            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_as_ci            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_as_cs            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_nopad_ai_ci      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_nopad_ai_cs      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_nopad_as_ci      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_czech_nopad_as_cs      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_ai_ci           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_ai_cs           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_as_ci           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_as_cs           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_nopad_ai_ci     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_nopad_ai_cs     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_nopad_as_ci     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_danish_nopad_as_cs     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_ai_ci       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_ai_cs       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_as_ci       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_as_cs       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_nopad_ai_ci | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_nopad_ai_cs | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_nopad_as_ci | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_lithuanian_nopad_as_cs | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_ai_ci           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_ai_cs           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_as_ci           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_as_cs           | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_nopad_ai_ci     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_nopad_ai_cs     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_nopad_as_ci     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_slovak_nopad_as_cs     | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_ai_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_ai_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_as_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_as_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_nopad_ai_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_nopad_ai_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_nopad_as_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_spanish2_nopad_as_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_ai_ci            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_ai_cs            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_as_ci            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_as_cs            | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_nopad_ai_ci      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_nopad_ai_cs      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_nopad_as_ci      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_roman_nopad_as_cs      | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_persian_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_ai_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_ai_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_as_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_as_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_nopad_ai_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_nopad_ai_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_nopad_as_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_esperanto_nopad_as_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_ai_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_ai_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_as_ci        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_as_cs        | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_nopad_ai_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_nopad_ai_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_nopad_as_ci  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_hungarian_nopad_as_cs  | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_sinhala_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_ai_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_ai_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_as_ci          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_as_cs          | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_nopad_ai_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_nopad_ai_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_nopad_as_ci    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_german2_nopad_as_cs    | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_ai_ci       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_ai_cs       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_as_ci       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_as_cs       | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_nopad_ai_ci | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_nopad_ai_cs | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_nopad_as_ci | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_vietnamese_nopad_as_cs | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_ai_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_ai_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_as_ci         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_as_cs         | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_nopad_ai_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_nopad_ai_cs   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_nopad_as_ci   | NULL               | NULL | NULL       | Yes         |       8 |
| uca1400_croatian_nopad_as_cs   | NULL               | NULL | NULL       | Yes         |       8 |
+--------------------------------+--------------------+------+------------+-------------+---------+
184 rows in set (0.001 sec)
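
The names in the listing follow a regular pattern: an optional language tailoring, an optional nopad marker, then ai/as (accent insensitive/sensitive) and ci/cs (case insensitive/sensitive). As a rough illustration only, the Python sketch below rebuilds the same 184 names from that pattern. The TAILORINGS list is copied from the output above, and the function name is hypothetical; nothing here queries MariaDB.

```python
from itertools import product

# Language tailorings in the order they appear in the listing above.
# The empty string stands for the untailored root collation.
TAILORINGS = [
    "", "icelandic", "latvian", "romanian", "slovenian", "polish",
    "estonian", "spanish", "swedish", "turkish", "czech", "danish",
    "lithuanian", "slovak", "spanish2", "roman", "persian", "esperanto",
    "hungarian", "sinhala", "german2", "vietnamese", "croatian",
]


def uca1400_names():
    """Rebuild the collation names from the pattern seen in the listing."""
    names = []
    for lang in TAILORINGS:
        # Padding varies slowest, then accent sensitivity, then case sensitivity.
        for pad, accent, case in product(["", "nopad"], ["ai", "as"], ["ci", "cs"]):
            parts = ["uca1400", lang, pad, accent, case]
            # Skip the empty optional parts when joining.
            names.append("_".join(part for part in parts if part))
    return names


names = uca1400_names()
print(len(names))   # 184, matching the row count above
print(names[0])     # uca1400_ai_ci
print(names[-1])    # uca1400_croatian_nopad_as_cs
```

Each of the 23 tailorings (22 languages plus the root) comes in 8 variants, which accounts for the 184 rows.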

Reference: MDEV-27009
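
On the byte-width part of the question: a collation only defines how strings compare and sort, while the character set defines how they are stored, so switching to a UCA 14.0.0 collation does not change the encoding. The Unicode code space ends at U+10FFFF, and UTF-8 encodes every code point in at most four bytes, so utf8mb4 should remain sufficient. A minimal Python sketch, independent of MariaDB, that checks this (the helper names are for illustration only):

```python
MAX_CODE_POINT = 0x10FFFF  # upper bound of the Unicode code space


def utf8_len(ch):
    """Number of bytes needed to store one character in UTF-8."""
    return len(ch.encode("utf-8"))


def max_utf8_width():
    """Widest UTF-8 encoding over all encodable code points."""
    return max(
        utf8_len(chr(cp))
        for cp in range(MAX_CODE_POINT + 1)
        # Surrogates (U+D800..U+DFFF) are not valid in UTF-8, so skip them.
        if not 0xD800 <= cp <= 0xDFFF
    )


print(utf8_len("A"))                  # 1 (U+0041)
print(utf8_len("\u00e9"))             # 2 (U+00E9, e with acute accent)
print(utf8_len("\u20ac"))             # 3 (U+20AC, euro sign)
print(utf8_len("\U0001F600"))         # 4 (U+1F600, emoji)
print(utf8_len(chr(MAX_CODE_POINT)))  # 4 (the last code point)
print(max_utf8_width())               # 4
```

Newer Unicode versions assign more characters inside this fixed range; they do not widen the encoding.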

A similar question about upgrading the MariaDB collation from utf8mb4_unicode_520_ci can be found on Stack Overflow: https://stackoverflow.com/questions/68308706/
