python - locale.getpreferredencoding() - Why does this reset string.letters?

Reposted by: IT老高 | Updated: 2023-10-28 21:13:26

>>> import string
>>> import locale
>>> string.letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> locale.getpreferredencoding()
'UTF-8'
>>> string.letters
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

Is there a workaround?

Platform: Linux. Python 2.6.7 and Python 2.7.3 appear to be affected; Python 3 works fine (using ascii_letters).

Best answer

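Note: the OP solved the problem by passing encoding='UTF-8' to the open call. If you have run into this issue and only want a workaround, that one works. The rest of this post explains why.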
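A minimal sketch of that workaround, assuming the goal is to read a text file. Passing the encoding explicitly means the preferred encoding does not have to be looked up from the locale. io.open is used here so the same code runs on Python 2 and Python 3; the helper name read_text and the temporary sample file are hypothetical, created only for illustration.

```python
import io
import os
import tempfile


def read_text(path):
    # Explicit encoding: nothing has to be inferred from the locale.
    with io.open(path, encoding="UTF-8") as handle:
        return handle.read()


# Hypothetical sample file, created only for this demonstration.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(sample_path, "w", encoding="UTF-8") as handle:
    handle.write(u"caf\u00e9\n")

content = read_text(sample_path)
os.remove(sample_path)
print(repr(content))
```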

What happens

As Lukas pointed out, the documentation states:

On some systems, it is necessary to invoke setlocale() to obtain the user preferences

Initially, string.letters is set to lowercase + uppercase:

lowercase = 'abcdefghijklmnopqrstuvwxyz'
uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
letters = lowercase + uppercase

However, when you call getpreferredencoding(), the _locale module overwrites it by calling PyDict_SetItemString(string, "letters", ulo); after regenerating the letters inside fixup_ulcase(void) with the following code:

/* create letters string */
n = 0;
for (c = 0; c < 256; c++) {
    if (isalpha(c))
        ul[n++] = c;
}
ulo = PyString_FromStringAndSize((const char *)ul, n);
if (!ulo)
    return;
if (string)
    PyDict_SetItemString(string, "letters", ulo);
Py_DECREF(ulo);
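The effect of that loop can be imitated in Python. This is only a sketch, not the interpreter's code: rebuild_letters is a hypothetical name, and it assumes that only ASCII letters count as alphabetic, which matches the output shown in the question. Because the loop walks byte values in ascending order, the uppercase letters (65 to 90) are collected before the lowercase ones (97 to 122), which would explain why the order flips.

```python
import string


def rebuild_letters():
    # Mirror the C loop: visit byte values 0..255 in order and keep the letters.
    # Assumption: only ASCII letters are alphabetic, as in the question's output.
    kept = []
    for code in range(256):
        ch = chr(code)
        if code < 128 and ch.isalpha():
            kept.append(ch)
    return "".join(kept)


rebuilt = rebuild_letters()
print(rebuilt)
```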

This function is in turn called from PyLocale_setlocale, which is the actual setlocale that getpreferredencoding calls. The code is here: http://hg.python.org/cpython/file/07a6fca7ff42/Lib/locale.py#l612 :

def getpreferredencoding(do_setlocale = True):
    """Return the charset that the user is likely using,
    according to the system configuration."""
    if do_setlocale:
        oldloc = setlocale(LC_CTYPE)
        try:
            setlocale(LC_CTYPE, "")
        except Error:
            pass
        result = nl_langinfo(CODESET)
        setlocale(LC_CTYPE, oldloc)
        return result
    else:
        return nl_langinfo(CODESET)

How can it be avoided?

Try getpreferredencoding(False).
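A small sketch of how that call could be checked. It records the LC_CTYPE setting and the letters string before and after the call and compares them. The getattr fallback to string.ascii_letters is an assumption added so the snippet also runs on Python 3, where string.letters does not exist.

```python
import locale
import string

# Snapshot the state that the default call path could disturb.
ctype_before = locale.setlocale(locale.LC_CTYPE)  # query only, changes nothing
letters_before = getattr(string, "letters", string.ascii_letters)

# Passing False skips the setlocale() round trip inside getpreferredencoding.
encoding = locale.getpreferredencoding(False)

ctype_after = locale.setlocale(locale.LC_CTYPE)
letters_after = getattr(string, "letters", string.ascii_letters)

print(encoding, ctype_before == ctype_after, letters_before == letters_after)
```

If the code only needs the 52 ASCII letters in a fixed order, string.ascii_letters is another option, since it is a constant that does not depend on the locale.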

Why doesn't it happen on Windows?

Windows uses different code to obtain the locale, as you can see here.

In Python 3

In Python 3, getdefaultlocale does not accept a boolean setlocale argument and does not call setlocale itself, as you can see here.

Regarding "python - locale.getpreferredencoding() - Why does this reset string.letters?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23743160/
