
Python - How to avoid UnicodeError?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:13:57

I am trying to write to a file, but I get the following error:

Traceback (most recent call last):
  File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/merge-395780681.888.py", line 151, in <module>
    gc_all_d.writerow(row)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 148, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0329' in position 5: ordinal not in range(128)

The error occurs after I try to write a row from the counselor database to a file that aggregates their names:

# compile master spreadsheet
with open('gc_all.txt_3', 'w') as gc_all:
    gc_all_d = csv.DictWriter(gc_all, fieldnames=fieldnames, extrasaction='ignore', delimiter='\t')
    gc_all_d.writeheader()
    for row in aicep_l:
        print row['name']
        gc_all_d.writerow(row)
    for row in nbcc_l:
        gc_all_d.writerow(row)
        print row['name']

I am in unfamiliar waters. I don't see a parameter on the writerow() method that would widen the encoding range to cover the character '\u0329'.

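I think the error may be related to my use of the nameparser module to put all the counselors' names into the same format. The HumanName function imported from nameparser may write out the counselors' names with a leading "u" to mark them as unicode, which would mean the overall output is the unrecognized u'Sam the Man' rather than 'Sam the Man'.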

Thanks for your help!


Error after modifying the code based on the answer:

  File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/merge-395782963.700.py", line 153, in <module>
    row['name'] = row['name'].encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 11: ordinal not in range(128)

The code that makes all the name entries uniform:

# nbcc
with open('/Users/samuelfinegold/Documents/noodle/gc/nbcc/nbcc_output.txt', 'rU') as nbcc:
    nbcc_d = csv.DictReader(nbcc, delimiter='\t')
    nbcc_l = []
    for row in nbcc_d:
        # name = HumanName(row['name'])
        # row['name'] = name.title + ' ' + name.first + ' ' + name.middle + ' ' + name.last + ' ' + name.suffix
        row['phone'] = row['phone'].translate(None, whitespace + punctuation)
        nbcc_l.append(row)

The modified code:

# compile master spreadsheet
with open('gc_all.txt_3', 'w') as gc_all:
    gc_all_d = csv.DictWriter(gc_all, fieldnames=fieldnames, extrasaction='ignore', delimiter='\t')
    gc_all_d.writeheader()
    for row in nbcc_l:
        row['name'] = row['name'].encode('utf-8')
        gc_all_d.writerow(row)

Error:

Traceback (most recent call last):
  File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/merge-395784700.086.py", line 153, in <module>
    row['name'] = row['name'].encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 11: ordinal not in range(128)
logout

Best answer

From the docs:

This version of the csv module doesn’t support Unicode input. Also, there are currently some issues regarding ASCII NUL characters. Accordingly, all input should be UTF-8 or printable ASCII to be safe; see the examples in section Examples.
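
To see why the docs ask for UTF-8 or ASCII input, the failure from the first traceback can be reproduced in isolation. This is only a small sketch and not part of the original answer; it uses nothing beyond the character u'\u0329' reported in the error, and the variable names are made up for illustration. It should run on both Python 2.7 and Python 3.

```python
import unicodedata

ch = u'\u0329'  # the character named in the UnicodeEncodeError

# Look up the official Unicode name; it is a combining mark, which is
# why it can hide inside an ordinary-looking name.
char_name = unicodedata.name(ch)

# ASCII has no representation for it, so an ASCII conversion fails.
try:
    ch.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent it; the result is a two-byte sequence starting with 0xcc.
utf8_bytes = ch.encode('utf-8')
```

The 0xcc byte here is the same byte that appears in the UnicodeDecodeError from the question's follow-up, which hints that the data read from the file was already UTF-8 encoded.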

You need to encode the data before writing it, for example:

for row in aicep_l:
    print row['name']
    # encode every value before handing the row to the writer
    for key, value in row.iteritems():
        row[key] = value.encode('utf-8')
    gc_all_d.writerow(row)
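
The UnicodeDecodeError in the question's follow-up suggests that some values, likely those read by csv.DictReader, are already byte strings. In Python 2, calling .encode() on a byte string first decodes it as ASCII, and 0xcc is the first byte of the UTF-8 encoding of u'\u0329'. One hedged way around this is to encode only the values that are actually text. The helper names to_utf8 and encode_row and the sample row are hypothetical, and the sketch is written so that it runs on both Python 2.7 and Python 3.

```python
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def to_utf8(value):
    """Return UTF-8 bytes for text; pass every other value through unchanged."""
    if isinstance(value, text_type):
        return value.encode('utf-8')
    return value


def encode_row(row):
    """Build a new dict whose text values are UTF-8 encoded."""
    return {key: to_utf8(value) for key, value in row.items()}


# Hypothetical row mixing a text value with an already-encoded one.
row = {'name': u'Sam\u0329 the Man', 'phone': b'5551234'}
encoded = encode_row(row)
```

In the question's Python 2 loop, this would presumably be used as gc_all_d.writerow(encode_row(row)), so rows from either source can be written without checking where they came from.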

Or, since you are using 2.7, you can use a dict comprehension:

for row in aicep_l:
    print row['name']
    # a dict comprehension needs `key: value`, not `key, value`
    row = {key: value.encode('utf-8') for key, value in row.iteritems()}
    gc_all_d.writerow(row)

Or use one of the more elaborate patterns from the examples page of the docs.
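
For readers who are able to run the script on Python 3 (an assumption, since the question uses Python 2.7), the csv module works on text directly, so the encoding is chosen once when the file is opened and no per-value encoding is needed. A minimal sketch with made-up field names and rows, written to a temporary file rather than the question's path:

```python
import csv
import os
import tempfile

fieldnames = ['name', 'phone']  # hypothetical columns
rows = [{'name': 'Sam\u0329 the Man', 'phone': '5551234', 'extra': 'ignored'}]

path = os.path.join(tempfile.mkdtemp(), 'gc_all.txt')

# newline='' lets the csv module control line endings; the encoding is explicit.
with open(path, 'w', encoding='utf-8', newline='') as gc_all:
    writer = csv.DictWriter(gc_all, fieldnames=fieldnames,
                            extrasaction='ignore', delimiter='\t')
    writer.writeheader()
    writer.writerows(rows)

# Read the file back with the same encoding to check the round trip.
with open(path, encoding='utf-8', newline='') as gc_all:
    read_back = list(csv.DictReader(gc_all, delimiter='\t'))
```

The DictWriter arguments mirror the ones in the question; only the open() call changes.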

Regarding "Python - How to avoid UnicodeError?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/17708506/
