
python - Django admin input causes a UnicodeDecodeError. What can be done?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 01:33:14

Today I received data through the Django admin that cannot be encoded. Somehow the data is not unicode. How is this possible?

My Client model has a name property that returns its data as unicode:

@property
def name(self):
    return u'{0} {1}'.format(self.firstname, self.lastname).strip()

But this does not work:

>>> client
<Client: [Bad Unicode data]>

>>> client.lastname
'Dani\xc3\xabl'

>>> client.lastname.__class__
<type 'str'>

>>> u"{0} {1}".format(client.firstname, client.lastname)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)

Strangely enough, formatting the first and last name into a regular (byte) string does work:

>>> "{0} {1}".format(client.firstname, client.lastname)
'Test Dani\xc3\xabl'

>>> "{0} {1}".format(client.firstname, client.lastname).decode('utf-8')
u'Test Dani\xebl'

What is going on here? How did this input get into my model through the admin?

System stack (it is an external server):

  • Debian 6.0.5 (squeeze)
  • Django 1.4.1
  • Python 2.6.6
  • MySQL 5.1.49
  • MySQL-python==1.2.2

Here is the relevant model code:

# Imports are assumed; the original excerpt does not show them.
from django.db import models
from django.utils.translation import ugettext_lazy as _


class Client(models.Model):
    firstname = models.CharField(_("Firstname"), max_length=255)
    lastname = models.CharField(_("Lastname"), max_length=255)
    email = models.EmailField(_("Email"), unique=True, max_length=255)

    class Meta:
        db_table = u'clients'
        ordering = ('firstname', 'lastname', 'email')

    def __unicode__(self):
        return u'{0} <{1}>'.format(self.name, self.email)

    @property
    def name(self):
        return u'{0} {1}'.format(self.firstname, self.lastname).strip()

Best answer

This is probably caused by the collation you are using for your MySQL database.

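Django's normal behavior is to always return unicode strings when retrieving data from the database. With that behavior your code would work, since there is nothing wrong with it.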

However, as the Django documentation on database settings explains in its section on collation settings, using MySQLdb version 1.2.2 with a utf8_bin-collated MySQL database means you get bytestrings, not unicode strings, when retrieving character fields from the database.
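The resulting failure can be reproduced without Django or MySQL. The following is a minimal sketch, assuming the byte string shown in the question is UTF-8 encoded (the successful .decode('utf-8') in the question suggests it is). It is written to run on both Python 2.6+ and Python 3; the variable names are illustrative only.

```python
# The value the question shows for client.lastname: UTF-8 bytes, not text.
raw_lastname = b'Dani\xc3\xabl'

# Python 2 falls back to the 'ascii' codec when a byte string is mixed with
# a unicode object. Doing that decode explicitly triggers the same error.
try:
    raw_lastname.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the encoding the bytes were actually written in succeeds.
lastname = raw_lastname.decode('utf-8')
full_name = u'{0} {1}'.format(u'Test', lastname).strip()
```

The error is raised at byte 0xc3, the first byte of the two-byte UTF-8 sequence for "ë", which is the same position reported in the traceback from the question.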

You may want to look into this (that is, check your MySQL collation settings), as your problem most likely originates there.

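If that is the case, you will have to manually decode every value you get from MySQL. Alternatively, you can change the collation settings of your database.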
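Manual decoding could look roughly like the sketch below. It assumes the stored bytes are UTF-8; to_text and full_name are hypothetical helpers written for illustration and are not part of Django or of the original code.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string.

    Hypothetical helper; the UTF-8 default is an assumption about the data.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def full_name(firstname, lastname):
    # Mirrors the name property from the question, but tolerates byte strings.
    return u'{0} {1}'.format(to_text(firstname), to_text(lastname)).strip()
```

Values that are already text pass through unchanged, so the helper stays safe to call if the backend later starts returning unicode strings again. Django 1.4 also ships django.utils.encoding.force_unicode, which performs a similar conversion.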

You can use SHOW TABLE STATUS FROM %YOURDB% to get the collation of the tables in your database.
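Once the collations are known, the affected tables can be picked out and conversion statements prepared for review. This is only a sketch under stated assumptions: the input is taken to be (table name, collation) pairs copied from the Name and Collation columns of SHOW TABLE STATUS, the example rows are made up (the orders table is hypothetical), and utf8_general_ci is assumed as the target collation. Nothing here connects to a database or executes SQL.

```python
def binary_collated_tables(status_rows):
    """Return the names of tables whose collation ends in '_bin'.

    status_rows is assumed to be an iterable of (table_name, collation)
    pairs; collation may be None (MySQL reports NULL for views).
    """
    return [name for name, collation in status_rows
            if collation and collation.endswith('_bin')]


def conversion_statements(table_names, collation='utf8_general_ci'):
    # Build (but do not execute) one ALTER TABLE statement per table.
    template = 'ALTER TABLE `{0}` CONVERT TO CHARACTER SET utf8 COLLATE {1};'
    return [template.format(name, collation) for name in table_names]


# Illustrative rows only; real values come from your own database.
example_rows = [('clients', 'utf8_bin'), ('orders', 'utf8_general_ci')]
statements = conversion_statements(binary_collated_tables(example_rows))
```

Switching a table from utf8_bin to a case-insensitive collation changes how string comparisons and unique constraints behave, so it is worth checking existing data for values that differ only by case before running any generated statement.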


Excerpt from the relevant documentation section:

By default, with a UTF-8 database, MySQL will use the utf8_general_ci_swedish collation. This results in all string equality comparisons being done in a case-insensitive manner. That is, "Fred" and "freD" are considered equal at the database level. If you have a unique constraint on a field, it would be illegal to try to insert both "aa" and "AA" into the same column, since they compare as equal (and, hence, non-unique) with the default collation.

In many cases, this default will not be a problem. However, if you really want case-sensitive comparisons on a particular column or table, you would change the column or table to use the utf8_bin collation. The main thing to be aware of in this case is that if you are using MySQLdb 1.2.2, the database backend in Django will then return bytestrings (instead of unicode strings) for any character fields it receives from the database. This is a strong variation from Django's normal practice of always returning unicode strings.

Regarding "python - Django admin input causes a UnicodeDecodeError. What can be done?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12227177/
