
mysql - What is the difference between the utf8mb4 and utf8 character sets in MySQL?

Reposted. Author: IT老高. Updated: 2023-10-28 12:48:05

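What is the difference between the utf8mb4 and utf8 character sets in MySQL?

I already know about the ASCII, UTF-8, UTF-16, and UTF-32 encodings, but I am curious how the utf8mb4 group of encodings differs from the other encoding types defined in MySQL Server.

Are there any particular benefits to, or recommendations for, using utf8mb4 rather than utf8?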

Best answer

UTF-8 is a variable-length encoding: storing one code point takes one to four bytes. However, MySQL's encoding called "utf8" (an alias of "utf8mb3") stores at most three bytes per code point.
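To make the byte counts concrete, here is a minimal Python sketch that measures how many bytes standard UTF-8 needs for a few sample code points. The helper name `utf8_length` and the sample characters are chosen only for illustration; the code uses Python's own UTF-8 encoder and does not talk to MySQL.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes standard UTF-8 needs for one code point."""
    return len(ch.encode("utf-8"))


# Illustrative samples: one code point from each UTF-8 length class.
samples = [
    "A",           # U+0041, ASCII letter
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, an emoji outside the BMP
]

for ch in samples:
    print(f"U+{ord(ch):04X} needs {utf8_length(ch)} byte(s) in UTF-8")
```

Only the last sample needs four bytes, which is exactly the case that a three-byte-per-code-point limit cannot represent.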

So the character set "utf8"/"utf8mb3" cannot store all Unicode code points: it only supports the range 0x0000 to 0xFFFF, which is called the "Basic Multilingual Plane" (BMP). See also Comparison of Unicode encodings.
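As a rough way to check whether a given string stays inside that range, the following Python sketch lists the code points above 0xFFFF. The names `BMP_MAX`, `non_bmp_chars`, and `fits_in_utf8mb3` are hypothetical helpers, not part of MySQL or any library, and the check only looks at the code point range described above.

```python
BMP_MAX = 0xFFFF  # highest code point in the Basic Multilingual Plane


def non_bmp_chars(text: str) -> list:
    """Return the characters in text whose code point lies above the BMP."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


def fits_in_utf8mb3(text: str) -> bool:
    """True if every code point is in the BMP, i.e. needs at most 3 UTF-8 bytes."""
    return not non_bmp_chars(text)


print(fits_in_utf8mb3("h\u00e9llo \u4e2d\u6587"))  # BMP-only text
print(fits_in_utf8mb3("hi \U0001F600"))            # contains an emoji
```

A check like this could be run over application data before deciding whether existing "utf8" columns are sufficient.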

This is what (a previous version of the same page of) the MySQL documentation has to say about it:

The character set named utf8[/utf8mb3] uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the utf8mb4 character set uses a maximum of four bytes per character supports supplemental characters:

  • For a BMP character, utf8[/utf8mb3] and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.

  • For a supplementary character, utf8[/utf8mb3] cannot store the character at all, while utf8mb4 requires four bytes to store it. Since utf8[/utf8mb3] cannot store the character at all, you do not have any supplementary characters in utf8[/utf8mb3] columns and you need not worry about converting characters or losing data when upgrading utf8[/utf8mb3] data from older versions of MySQL.

So if you want your column to support storing characters that lie outside the BMP (and you usually do), such as emoji, use "utf8mb4". See also What are the most common non-BMP Unicode characters in actual use?.
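For an existing table, one possible way to switch is MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` statement. The Python sketch below only builds that statement as a string; the helper `convert_table_sql`, the table name `comments`, and the default collation `utf8mb4_unicode_ci` are assumptions made for illustration, and the answer above does not prescribe any particular migration procedure.

```python
import re


def convert_table_sql(table: str, collation: str = "utf8mb4_unicode_ci") -> str:
    """Build an ALTER TABLE statement that converts a table to utf8mb4.

    Both arguments are illustrative; the statement is returned, not executed.
    """
    # Accept only plain utf8mb4 collation names to avoid building odd SQL.
    if not re.fullmatch(r"utf8mb4_[A-Za-z0-9_]+", collation):
        raise ValueError(f"not a utf8mb4 collation: {collation!r}")
    # Quote the identifier with backticks, doubling any embedded backtick.
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} "
        f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
    )


print(convert_table_sql("comments"))
```

Before running such a statement on real data, it is worth testing on a copy, since the client connection also has to use utf8mb4 for four-byte characters to arrive intact.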

Regarding "mysql - What is the difference between the utf8mb4 and utf8 character sets in MySQL?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30074492/
