
python - psycopg2.DataError : invalid byte sequence for encoding "UTF8": 0xa0

Reposted. Author: 太空狗. Updated: 2023-10-29 21:37:16

I have googled this error extensively and narrowed it down to the fact that the two databases I'm working with use different encodings.

The AIX server I'm working with is running

psql 8.2.4

server_encoding | LATIN1 | | Client Connection Defaults / Locale and Formatting | Sets the server (database) character set encoding.

The Windows 2008 R2 server I'm working with is running

psql (9.3.4)

CREATE DATABASE postgres
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'English_Australia.1252'
LC_CTYPE = 'English_Australia.1252'
CONNECTION LIMIT = -1;

COMMENT ON DATABASE postgres
IS 'default administrative connection database';

Now, when I try to run the Python script below, I get this error

Traceback (most recent call last):
  File "datamain.py", line 39, in <module>
    sys.exit(main())
  File "datamain.py", line 33, in main
    write_file_to_table("cms_jobdef.txt", "cms_jobdef", con_S104838)
  File "datamain.py", line 21, in write_file_to_table
    cur.copy_from(f, table, ",")
psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0xa0
CONTEXT: COPY cms_jobdef, line 15209

Here is my script

import psycopg2
import StringIO
import sys
import pdb

def connect_db(db, usr, pw, hst, prt):
    conn = psycopg2.connect(database=db, user=usr,
        password=pw, host=hst, port=prt)
    return conn

def write_table_to_file(file, table, connection):
    f = open(file, "w")
    cur = connection.cursor()
    cur.copy_to(f, table, ",")
    f.close()
    cur.close()

def write_file_to_table(file, table, connection):
    f = open(file,"r")
    cur = connection.cursor()
    cur.copy_from(f, table, ",")
    f.close()
    cur.close()

def main():
    login = open('login.txt','r')
    con_tctmsv64 = connect_db("x", "y",
        login.readline().strip(),
        "d.domain", "c")
    con_S104838 = connect_db("x", "y", "z", "a", "b")
    try:
        write_table_to_file("cms_jobdef.txt", "cms_jobdef", con_tctmsv64)
        write_file_to_table("cms_jobdef.txt", "cms_jobdef", con_S104838)
    finally:
        con_tctmsv64.close()
        con_S104838.close()

if __name__ == "__main__":
    sys.exit(main())

Some sensitive data has been removed.

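So I'm not sure how to proceed. As far as I can tell, the copy_expert method might help by exporting the data in UTF8 encoding. But because the server I'm extracting data from is running 8.2.4, I don't think it supports the COPY encoding option.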

I think my best bet is to reinstall the Postgres database on the Windows server with LATIN1 encoding. When I try to do that, I get the following error.

[Screenshot: psql error]

So I'm pretty stuck. Any help would be greatly appreciated!

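Update: I installed the Postgres database on Windows with LATIN1 encoding by changing the default locale to "C". However, this gives me the following error, and it doesn't look like a successful or correct approach.

[Screenshot of the error]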

I also tried encoding the file as BINARY using the PSQL COPY function

def write_table_to_file(file, table, connection):
    f = open(file, "w")
    cur = connection.cursor()
    #cur.copy_to(f, table, ",")
    cur.copy_expert("COPY cms_jobdef TO STDOUT WITH BINARY", f)
    f.close()
    cur.close()

def write_file_to_table(file, table, connection):
    f = open(file,"r")
    cur = connection.cursor()
    #cur.copy_from(f, table)
    cur.copy_expert("COPY cms_jobdef FROM STDOUT WITH BINARY", f)
    f.close()
    cur.close()

Still no luck; I get the same error

DataError: invalid byte sequence for encoding "UTF8": 0xa0
CONTEXT: COPY cms_jobdef, line 15209, column descript

Regarding Phil's answer, I have tried that approach and still had no success.

import psycopg2
import StringIO
import sys
import pdb
import codecs

def connect_db(db, usr, pw, hst, prt):
    conn = psycopg2.connect(database=db, user=usr,
        password=pw, host=hst, port=prt)
    return conn

def write_table_to_file(file, table, connection):
    f = open(file, "w")
    #fx = codecs.EncodedFile(f,"LATIN1", "UTF8")
    cur = connection.cursor()
    cur.execute("SHOW client_encoding;")
    print cur.fetchone()
    cur.copy_to(f, table)
    #cur.copy_expert("COPY cms_jobdef TO STDOUT WITH BINARY", f)
    f.close()
    cur.close()

def write_file_to_table(file, table, connection):
    f = open(file,"r")
    cur = connection.cursor()
    cur.execute("SET CLIENT_ENCODING TO 'LATIN1';")
    cur.execute("SHOW client_encoding;")
    print cur.fetchone()
    cur.copy_from(f, table)
    #cur.copy_expert("COPY cms_jobdef FROM STDOUT WITH BINARY", f)
    f.close()
    cur.close()

def main():
    login = open('login.txt','r')
    con_tctmsv64 = connect_db("x", "y",
        login.readline().strip(),
        "ctmtest1.int.corp.sun", "5436")
    con_S104838 = connect_db("x", "y", "z", "t", "5432")
    try:
        write_table_to_file("cms_jobdef.txt", "cms_jobdef", con_tctmsv64)
        write_file_to_table("cms_jobdef.txt", "cms_jobdef", con_S104838)
    finally:
        con_tctmsv64.close()
        con_S104838.close()

if __name__ == "__main__":
    sys.exit(main())

Output

In [4]: %run datamain.py
('sql_ascii',)
('LATIN1',)

In [5]:

This completes successfully, but when I run

select * from cms_jobdef;

there is nothing in the new database.


I even tried converting the file from LATIN1 to UTF8. Still no luck.

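The odd thing is that when I do this process manually, using only the Postgres COPY function, it works. I have no idea why. Once again, any help is appreciated.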

Best answer

It turns out there are a few options for solving this problem.

The option Phil suggested, changing the client encoding, does work.

cur.execute("SET CLIENT_ENCODING TO 'LATIN1';")
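A small sketch of why this helps, written in Python 3 and independent of any database: byte 0xa0 is a complete character in LATIN1 (a no-break space) but is not valid on its own in UTF8. Declaring the client encoding as LATIN1 tells the UTF8 server to transcode the incoming bytes instead of rejecting them. The sample value below is made up for illustration and is not taken from the cms_jobdef table.

```python
# Hypothetical sample: a LATIN1-encoded value containing byte 0xa0.
raw = b"job\xa0name"

# Read as UTF8, the lone 0xa0 byte is rejected, much like the COPY error.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Read as LATIN1, 0xa0 is U+00A0 (no-break space).
text = raw.decode("latin-1")

# Re-encoded as UTF8, the same character becomes the two bytes 0xc2 0xa0.
utf8_bytes = text.encode("utf-8")

print(valid_utf8)
print(repr(text))
print(utf8_bytes)
```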

Another option is to convert the data on the fly. I used a Python module called codecs to do this.

f = open(file, "w")
fx = codecs.EncodedFile(f,"LATIN1", "UTF8")
cur = connection.cursor()
cur.execute("SHOW client_encoding;")
print cur.fetchone()
cur.copy_to(fx, table)

The key line is

fx = codecs.EncodedFile(f,"LATIN1", "UTF8")
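To illustrate the argument order, here is a minimal Python 3 sketch that uses an in-memory io.BytesIO as a stand-in for the export file (an assumption made so the example runs without a database). The first encoding describes the bytes written to the wrapper, and the second describes the bytes that end up in the underlying file. The sample row is made up.

```python
import codecs
import io

# Stand-in for the file opened with open(file, "w") in the snippet above.
underlying = io.BytesIO()

# Bytes written to fx are treated as LATIN1 and stored as UTF8.
fx = codecs.EncodedFile(underlying, "LATIN1", "UTF8")

# Hypothetical row as a LATIN1 server might emit it: 0xe9 is e-acute,
# 0xa0 is a no-break space.
fx.write(b"1\tcaf\xe9\xa0au lait\n")

stored = underlying.getvalue()
print(stored)
```

The stored bytes decode cleanly as UTF8, so a file produced this way can be loaded into the UTF8 database without changing the client encoding.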

My main problem was that I wasn't committing the changes to the database! Silly me :)
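Putting the client-encoding fix and the missing commit together, the load step might look roughly like the Python 3 sketch below. The function name copy_file_to_table and its parameters are hypothetical; conn is assumed to be an open psycopg2 connection, and only the set_client_encoding, cursor, copy_from, commit and rollback calls of psycopg2 are used. The file is opened in binary mode so the LATIN1 bytes reach the server unchanged.

```python
def copy_file_to_table(conn, path, table, sep="\t", client_encoding="LATIN1"):
    """Load a COPY text file into `table`, then commit (hypothetical helper)."""
    # Tell the server which encoding the incoming bytes use.
    conn.set_client_encoding(client_encoding)
    cur = conn.cursor()
    try:
        with open(path, "rb") as f:
            cur.copy_from(f, table, sep=sep)
        # Without this commit the copied rows are discarded when the
        # connection closes, which is why the table looked empty.
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

With the names from the question, a call would be copy_file_to_table(con_S104838, "cms_jobdef.txt", "cms_jobdef").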

Regarding python - psycopg2.DataError : invalid byte sequence for encoding "UTF8": 0xa0, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24176271/
