python - What is the difference between "# -*- coding: utf-8 -*-", "from __future__ import unicode_literals" and "sys.setdefaultencoding('utf8')"?


Reposted by: 太空宇宙 · Updated: 2023-11-04 02:25:08


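What I know:

# -*- coding: utf-8 -*-
This declares the encoding of the Python source file. Once I set the encoding name, the Python parser interprets the file using that encoding. I call this the "file encoding".

from __future__ import unicode_literals
I am working in Python 2.7, and I use from __future__ import unicode_literals to change the default type of string literals from str to unicode. I call this the "string encoding".

sys.setdefaultencoding('utf8')
Sometimes, though, I still get an error in Django. For example, after storing Chinese text through the admin, I visit the related page and see: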

UnicodeEncodeError at /admin/blog/vulpaper/29/change/
'ascii' codec can't encode characters in position 6-13: ordinal not in range(128)
... (further error information omitted)
The string that could not be encoded/decoded was: emcms外贸网站管理系统

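To work around this, I put sys.setdefaultencoding('utf8') in the Django settings file.

But I do not actually understand the technical details behind any of this.

What confuses me:
1. Since I have already set the Python source file encoding, why do I also need to set the string encoding to make sure my strings use the encoding I want?
What is the difference between the "file encoding" and the "string encoding"?
2. I have set both the "file encoding" and the "string encoding", so why do I still get a UnicodeEncodeError?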

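Best answer

Usually you need both the file encoding declaration and unicode string literals, but they control very different things, so it helps to understand the difference.

File encoding

If you want to write Unicode characters anywhere in your source code (for example in comments or string literals), you need to declare the encoding so that the Python parser can read the file correctly. A wrong declaration can result in a SyntaxError exception. PEP 263 explains the problem in detail, as well as how to control the parser's encoding.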

In Python 2.1, Unicode literals can only be written using the Latin-1 based encoding "unicode-escape". This makes the programming environment rather unfriendly to Python users who live and work in non-Latin-1 locales such as many of the Asian countries.

...

Python will default to ASCII as standard encoding if no other encoding hints are given.
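
To see the declaration at work, here is a minimal sketch. It assumes Python 3, where compile() accepts raw source bytes and honors the PEP 263 declaration; the helper name run_source is hypothetical. The same two bytes are handed to the parser under two different declarations:

```python
# Sketch for Python 3; run_source is a hypothetical helper, not a library API.

def run_source(source_bytes):
    """Compile raw source bytes and return the value the source binds to `s`."""
    namespace = {}
    exec(compile(source_bytes, "<demo>", "exec"), namespace)
    return namespace["s"]

# The two bytes 0xC3 0xA9 are the UTF-8 encoding of the single character "é".
body = b's = "\xc3\xa9"\n'

# Same bytes, different declaration on the first line of the "file".
as_utf8 = run_source(b"# -*- coding: utf-8 -*-\n" + body)
as_latin1 = run_source(b"# -*- coding: latin-1 -*-\n" + body)

print(len(as_utf8))    # 1: the parser saw one character
print(len(as_latin1))  # 2: each byte was read as a separate Latin-1 character
```

The declaration only tells the parser how to turn the bytes of the file into characters. It says nothing about which type a string literal has at runtime.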

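Unicode string literals

Python 2 has two different string types, unicode and str. When you define a string literal without a prefix, the interpreter creates a new object of type str to hold it.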

s = "A literal string"
print type(s)

<type 'str'>

TL;DR

If you want to change this behavior and create a unicode object every time an unprefixed string literal is defined, you can use from __future__ import unicode_literals.
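
The import acts as a per-module compiler switch. The sketch below is an illustration only: it passes that switch to compile() by hand so the effect can be compared side by side in one file. The names FLAG, SOURCE and text_type are made up for the example; in real code you would simply put the import at the top of the module.

```python
import __future__

# compiler_flag is the switch that "from __future__ import unicode_literals"
# turns on for the module containing the import.
FLAG = __future__.unicode_literals.compiler_flag

SOURCE = '"A literal string"'  # an unprefixed string literal

# Compile the same literal without and with the feature enabled.
# The last argument (dont_inherit=True) ignores future imports of this file.
plain = eval(compile(SOURCE, "<demo>", "eval", 0, True))
with_flag = eval(compile(SOURCE, "<demo>", "eval", FLAG, True))

text_type = type(u"")  # unicode on Python 2, str on Python 3

print(type(plain).__name__)              # str
print(type(with_flag).__name__)          # unicode on Python 2, str on Python 3
print(isinstance(with_flag, text_type))  # True
```

On Python 3 every unprefixed literal is already text, so the switch changes nothing there; the difference is only visible on Python 2.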

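If you want to understand why this is useful, read on.

You can explicitly define a string literal as unicode with the u prefix. The interpreter then creates a unicode object for that literal.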

s = u"A literal string"
print type(s)

<type 'unicode'>

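For ASCII text the str type is sufficient, but if you intend to manipulate non-ASCII text it is important to use the unicode type so that character-level operations work correctly. The following example shows how the exact same literal is interpreted at the character level as str and as unicode.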

# -*- coding: utf-8 -*-

def print_characters(s):
    print "String of type {}".format(type(s))
    print "  Length: {} ".format(len(s))
    print "  Characters: " ,
    for c in s:
        print c,
    print
    print


u_lit = u"Γειά σου κόσμε"
s_lit = "Γειά σου κόσμε"

print_characters(u_lit)
print_characters(s_lit)

Output:

String of type <type 'unicode'>
  Length: 14 
  Characters:  Γ ε ι ά   σ ο υ   κ ό σ μ ε

String of type <type 'str'>
  Length: 26 
  Characters:  � � � � � � � �   � � � � � �   � � � � � � � � � �

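With str, the length is reported as 26, which is the number of UTF-8 bytes rather than the number of characters, and iterating over it yields garbage. unicode, on the other hand, works as expected.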
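
The relationship between the two lengths can be checked directly. This is a small sketch that runs on both Python 2.7 and Python 3; the variable names are chosen for the example. It encodes the text to UTF-8 bytes and decodes it back:

```python
# -*- coding: utf-8 -*-
text = u"Γειά σου κόσμε"    # text (unicode on Python 2, str on Python 3)
raw = text.encode("utf-8")  # the bytes a Python 2 str literal would hold
decoded = raw.decode("utf-8")

print(len(text))        # 14 characters
print(len(raw))         # 26 bytes: 12 Greek letters at 2 bytes each, plus 2 spaces
print(decoded == text)  # True: decoding restores the original characters
```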

Setting sys.setdefaultencoding('utf8')

There is a nice answer on Stack Overflow about why we should not use it :)
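
The traceback in the question is consistent with text being converted implicitly through the ASCII codec. As a hedged sketch of the usual alternative (not necessarily what the linked answer prescribes), you can keep text as unicode inside the program and encode or decode explicitly at the boundaries. The helpers to_bytes and to_text are hypothetical names invented for this example:

```python
# -*- coding: utf-8 -*-
title = u"emcms外贸网站管理系统"

# Converting with the ASCII codec reproduces the error from the question.
try:
    title.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

def to_bytes(value, encoding="utf-8"):
    """Encode text explicitly; pass byte strings through unchanged."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    return value.encode(encoding) if isinstance(value, text_type) else value

def to_text(value, encoding="utf-8"):
    """Decode byte strings explicitly; pass text through unchanged."""
    return value.decode(encoding) if isinstance(value, bytes) else value

payload = to_bytes(title)  # explicit UTF-8 at the output boundary
print(failed)                      # True
print(to_text(payload) == title)   # True
```

Naming the codec at each boundary keeps the conversion visible in the code, instead of changing a process-wide default that other modules may rely on.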

This article is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/50579547/
