
c - What character encoding does fopen() or open() use?

Reposted. Author: 太空狗. Updated: 2023-10-29 11:40:42

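When you use a function such as fopen(), you have to pass it a string argument for the file name. I would like to know what character encoding that string should be in.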

This question has already been asked here, but it received conflicting answers. One answer says:

It depends on the system locale. Look at the output of the "locale" command. If the variables end in UTF-8, then your locale is UTF-8. Most modern linuxes will be using UTF-8. Although Andrew is correct that technically it's just a byte string, if you don't match the system locale some programs may not work correctly and it will be impossible to get correct user input, etc. It's best to stick with UTF-8.

Another answer says:

Filesystem calls on Linux are encoding-agnostic, i.e. they do not (need to) know about the particular encoding. As far as they are concerned, the byte-string pointed to by the filename argument is passed down to the filesystem as-is. The filesystem expects that filenames are in the correct encoding (usually UTF-8, as mentioned by Matthew Talbert).

This means that you often don't need to do anything (filenames are treated as opaque byte-strings), but it really depends on where you receive the filename from, and whether you need to manipulate the filename in any way.

Which answer is correct?

Best answer

They are both correct in some respects.

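The string passed to a file system call is a byte string: a null byte marks the end of the string, and '/' separates path components. Within a file name segment, the meaning of the bytes does not matter to the file system; they are just a sequence of bytes.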

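How the bytes that make up a file name are displayed depends on the device used to display them. If the name uses UTF-8 with non-ASCII characters, printing that data as ISO 8859-15 (or 8859-1 for the die-hard residents of the US) produces gibberish, often including C1 control bytes from the byte range 0x80 .. 0x9F. If the name uses 8859-15 with non-ASCII characters, a UTF-8 display will find invalid UTF-8 sequences in it, and you will see illegible or meaningless output (question marks or some other indication of an invalid UTF-8 sequence).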

Regarding "c - What character encoding does fopen() or open() use?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50790217/
