unicode - What is the difference between Unicode and UTF-8?

Reposted. Author: 行者123. Updated: 2023-12-03 04:10:38

Consider:

[Image]

Is it true that Unicode = UTF-16?

Many people say that Unicode is a standard, not an encoding, yet most editors offer a "Save as Unicode encoding" option.

Best answer

As Rasmus states in his article "The difference between UTF-8 and Unicode?":

If asked the question, "What is the difference between UTF-8 and Unicode?", would you confidently reply with a short and precise answer? In these days of internationalization all developers should be able to do that. I suspect many of us do not understand these concepts as well as we should. If you feel you belong to this group, you should read this ultra short introduction to character sets and encodings.

Actually, comparing UTF-8 and Unicode is like comparing apples and oranges:

UTF-8 is an encoding - Unicode is a character set

A character set is a list of characters with unique numbers (these numbers are sometimes referred to as "code points"). For example, in the Unicode character set, the number for A is 41 in hexadecimal (written U+0041), which is 65 in decimal.
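A minimal sketch of the character-to-number mapping, using Python's built-in `ord` and `chr`. The variable name `code_point` is only illustrative. Note that the 41 quoted for A is a hexadecimal value; in decimal it is 65.

```python
# Map a character to its Unicode code point and back again.
code_point = ord("A")            # character -> number

print(code_point)                # 65 (decimal)
print(hex(code_point))           # 0x41 (hexadecimal)
print(f"U+{code_point:04X}")     # U+0041 (conventional notation)
print(chr(0x41))                 # A (number -> character)
```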

An encoding, on the other hand, is an algorithm that translates a list of numbers to binary so it can be stored on disk. For example, UTF-8 would translate the number sequence 1, 2, 3, 4 like this:

00000001 00000010 00000011 00000100 

Our data is now translated into binary and can now be saved to disk.
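The encoding step can be reproduced in Python as a rough sketch. The names `numbers`, `encoded`, and `bits` are made up for this example. It only covers the simple case from the text: code points below 128 take exactly one byte each in UTF-8, while larger code points take two to four bytes.

```python
# Turn the code points 1, 2, 3, 4 into characters, then encode them as UTF-8.
numbers = [1, 2, 3, 4]
text = "".join(chr(n) for n in numbers)

encoded = text.encode("utf-8")   # bytes object: b'\x01\x02\x03\x04'

# Show each byte as 8 binary digits, as in the example above.
bits = " ".join(f"{byte:08b}" for byte in encoded)
print(bits)  # 00000001 00000010 00000011 00000100
```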

All together now

Say an application reads the following from the disk:

01101000 01100101 01101100 01101100 01101111

The app knows this data represents a Unicode string encoded with UTF-8 and must show this as text to the user. The first step is to convert the binary data to numbers. The app uses the UTF-8 algorithm to decode the data. In this case, the decoder returns this:

104 101 108 108 111 

Since the app knows this is a Unicode string, it can assume each number represents a character. We use the Unicode character set to translate each number to a corresponding character. The resulting string is "hello".
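The two decoding steps can be sketched in Python like this. The bit groups are the ones from the example, written out as full 8-bit bytes; the variable names are illustrative. In real code `bytes.decode` does both steps at once, so the intermediate list of numbers is shown only to mirror the explanation.

```python
# The bytes read from disk, written as binary digits.
bit_groups = "01101000 01100101 01101100 01101100 01101111".split()
raw = bytes(int(group, 2) for group in bit_groups)

# Decode the UTF-8 bytes into a string, then list the code points it contains.
text = raw.decode("utf-8")
numbers = [ord(ch) for ch in text]

print(numbers)  # [104, 101, 108, 108, 111]
print(text)     # hello
```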

Conclusion

So when somebody asks you "What is the difference between UTF-8 and Unicode?", you can now confidently give a short and precise answer:

UTF-8 (Unicode Transformation Format) and Unicode cannot be compared. UTF-8 is an encoding used to translate numbers into binary data. Unicode is a character set used to translate characters into numbers.
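As for the "Unicode = UTF-16?" part of the question, a small Python sketch may help: the same Unicode code point produces different bytes depending on which encoding is applied, and UTF-16 is just one of those encodings. The character and the variable names here are chosen only for illustration.

```python
# One character, one code point, several possible byte representations.
text = "é"
print(f"U+{ord(text):04X}")  # U+00E9

encoded = {name: text.encode(name) for name in ("utf-8", "utf-16-le", "utf-32-le")}
for name, data in encoded.items():
    print(name, data.hex())
# utf-8 c3a9
# utf-16-le e900
# utf-32-le e9000000
```

The code point stays U+00E9 in every case; only the stored bytes change.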

Regarding "unicode - What is the difference between Unicode and UTF-8?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/3951722/
