
python - UnicodeEncodeError: 'charmap' codec can't encode character '\u010d'

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:03:31

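I am using a Python script to open a .csv file and import its data into a database. A few Latin characters were causing errors, so I tried reading the file as UTF-8 with errors='replace', expecting the troublesome characters to be replaced with question marks. Even after doing this, I still get the following error: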

UnicodeEncodeError: 'charmap' codec can't encode character '\u010d' in position 2: character maps to <undefined>

I am using Python 3.7.4. Here is my current code:

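import csv
import cx_Oracle
import io

localfile = 'C:/User/Documents/Upload/data.csv'
connection = cx_Oracle.connect()  # connection arguments omitted in the original post
cursor = connection.cursor()  # statements are executed on a cursor, not on the connection

with io.open(localfile, 'r', encoding='utf-8', errors='replace') as csvfile:
    reader = csv.reader(csvfile)  # missing from the post; comma delimiter assumed
    for row in reader:
        cursor.execute(
            "INSERT INTO database.my_table (Column_1, Column_2, Column_3) values (:1, :2, :3)",
            [row[0], row[1], row[2]])
connection.commit()  # commit once, after all rows are inserted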

Edit:

Here is the full traceback:

Traceback (most recent call last):
  File "c:\User\.vscode\extensions\ms-python.python-2019.8.30787\pythonFiles\ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "c:\User\.vscode\extensions\ms-python.python-2019.8.30787\pythonFiles\lib\python\ptvsd\__main__.py", line 432, in main
    run()
  File "c:\User\.vscode\extensions\ms-python.python-2019.8.30787\pythonFiles\lib\python\ptvsd\__main__.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "C:\User\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\User\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\User\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\User\Documents\Python_Projects\python_sftp_remote_server_edition.py", line 116, in <module>
    insert(localfile, c)
  File "c:\User\Documents\Python_Projects\python_sftp_remote_server_edition.py", line 28, in insert
    row[0], row[1], row[2]])
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character '\u010d' in position 2: character maps to <undefined>

Best answer

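As the traceback shows, the failure happens in encodings\cp1252.py: the database interface expects its input in Windows code page 1252, and the errors='replace' you passed when reading the file has no effect on that later encoding step. You can try converting each value to this encoding with errors='replace' and then back again: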

item = item.encode('cp1252', errors='replace').decode('cp1252')

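To spell this out: we convert the Unicode string to CP1252 and back to Unicode, replacing every character that cannot make the round trip, and then pass the result to an interface that will convert it to CP1252 once more. This is arguably not elegant at all.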
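
As a minimal sketch of that round trip, the snippet below first reproduces the failure without any database and then applies the conversion to a whole row. The helper name `to_cp1252_safe` and the sample row are hypothetical and chosen only for illustration; they do not come from the original code.

```python
def to_cp1252_safe(text):
    """Round-trip text through CP1252, replacing unencodable characters with '?'."""
    return text.encode('cp1252', errors='replace').decode('cp1252')


# Hypothetical sample row: '\u010d' (c with caron) has no CP1252 mapping,
# while '\u00e9' (e with acute) does.
row = ['ko\u010dka', 'caf\u00e9', 'plain']

# A strict encode fails in the same way as the traceback above.
try:
    row[0].encode('cp1252')
except UnicodeEncodeError as exc:
    print('strict encode failed:', exc.reason)

# After the round trip, every value is safe to hand to a CP1252-only interface.
cleaned = [to_cp1252_safe(item) for item in row]
print(cleaned)  # ['ko?ka', 'café', 'plain']
```

In the question's loop, the same helper would presumably be applied to row[0], row[1] and row[2] before they are bound to the INSERT statement.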

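A better strategy is to switch to a database that handles Unicode properly. With errors='replace', you are essentially asking the computer to corrupt any data that the limited legacy target encoding cannot represent.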
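
To make the cost of that corruption concrete, here is a small sketch that shows two distinct values collapsing into the same string and reports which values would be damaged before anything is sent to the database. The function names `is_encodable` and `find_lossy_values` and the sample names are assumptions made for this example only.

```python
def is_encodable(ch, encoding='cp1252'):
    """Return True if a single character can be encoded in the target encoding."""
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def find_lossy_values(values, encoding='cp1252'):
    """Return (index, value, bad_chars) for each value that would lose characters."""
    problems = []
    for index, value in enumerate(values):
        bad_chars = sorted({ch for ch in value if not is_encodable(ch, encoding)})
        if bad_chars:
            problems.append((index, value, bad_chars))
    return problems


# Hypothetical data: two different surnames that differ only in the last letter.
names = ['Kova\u010d', 'Kova\u0107', 'Smith']

# With errors='replace' the two surnames become indistinguishable.
replaced = [n.encode('cp1252', errors='replace').decode('cp1252') for n in names]
print(replaced)  # ['Kova?', 'Kova?', 'Smith']

# Reporting the affected values first lets you decide how to handle them.
for index, value, bad_chars in find_lossy_values(names):
    print(index, ascii(value), [ascii(ch) for ch in bad_chars])
```

A check like this does not fix anything by itself, but it shows how much data a CP1252-only target would silently alter.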

Regarding python - UnicodeEncodeError: 'charmap' codec can't encode character '\u010d', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57790451/
