
Python "ASCII codec can't encode character" error when writing to CSV

Reposted. Author: 太空狗. Updated: 2023-10-29 20:37:51

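I'm not entirely sure what I need to do about this error. I think it has to do with needing to add .encode('utf-8'), but I'm not sure whether that is what I need to do, or where I should apply it.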

The error is:

line 40, in <module>
writer.writerows(list_of_rows)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 17: ordinal not in range(128)

Here is the basis of my Python script.

import csv
import requests
from BeautifulSoup import BeautifulSoup

url = 'https://dummysite'

response = requests.get(url)

html = response.content

soup = BeautifulSoup(html)

table = soup.find('table', {'class': 'table'})

list_of_rows = []
for row in table.findAll('tr')[1:]:
    list_of_cells = []
    for cell in row.findAll('td'):
        text = cell.text.replace('[','').replace(']','')
        list_of_cells.append(text)
    list_of_rows.append(list_of_cells)

outfile = open("./test.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Name", "Location"])
writer.writerows(list_of_rows)

Best answer

The Python 2.x csv library is broken with respect to Unicode. You have three options, in order of complexity:

  1. (Superseded, see the edit below.) Use the fixed library https://github.com/jdunck/python-unicodecsv (pip install unicodecsv). It works as a drop-in replacement. Example:

    import unicodecsv

    with open("myfile.csv", 'rb') as my_file:
        r = unicodecsv.DictReader(my_file, encoding='utf-8')

  2. Read the Unicode section of the csv manual: https://docs.python.org/2/library/csv.html (see the examples at the bottom).

  3. Manually encode each item as UTF-8:

    for cell in row.findAll('td'):
        text = cell.text.replace('[','').replace(']','')
        list_of_cells.append(text.encode("utf-8"))
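
The manual-encoding option can also be factored into a small helper. This is a minimal sketch, assuming the cells are unicode strings of the kind BeautifulSoup returns. The name encode_row is hypothetical and not part of the original answer:

```python
def encode_row(row, encoding="utf-8"):
    """Return a copy of row with every text cell encoded to bytes.

    Cells that are not text (numbers, None, already-encoded bytes)
    are passed through unchanged.
    """
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    return [cell.encode(encoding) if isinstance(cell, text_type) else cell
            for cell in row]


# The en dash (u'\u2013') from the traceback becomes three UTF-8 bytes.
encoded = encode_row([u"2010\u20132012", u"Paris"])
```

Under Python 2, the last line of the question's script would then read writer.writerows([encode_row(r) for r in list_of_rows]). This pre-encoding step only makes sense on Python 2. On Python 3 the csv module writes text directly, and encoded cells would be written as their bytes repr.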

Edit: I found that python-unicodecsv is also broken when reading UTF-16. It complains about any 0x00 bytes.
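
Why UTF-16 input contains 0x00 bytes at all can be seen from a short experiment. This only illustrates the encoding. It does not reproduce the unicodecsv failure, and the sample string is made up:

```python
# Every ASCII character occupies two bytes in UTF-16, and one of them is 0x00.
sample = u"Name,Location"
utf16_bytes = sample.encode("utf-16-le")  # little-endian, no byte-order mark
utf8_bytes = sample.encode("utf-8")

null_count_utf16 = utf16_bytes.count(b"\x00")  # one NUL per ASCII character
null_count_utf8 = utf8_bytes.count(b"\x00")    # UTF-8 adds no NUL bytes here
```

A parser that works on raw bytes sees those NUL bytes as data, which is consistent with the complaint described above.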

Instead, use https://github.com/ryanhiebert/backports.csv, which more closely resembles the Python 3 implementation and uses the io module.

Install:

pip install backports.csv

Usage:

from backports import csv
import io

with io.open(filename, encoding='utf-8') as f:
    r = csv.reader(f)
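
The usage above reads a file, while the question is about writing one. Here is a sketch of the writing side under the same approach. The helper name write_rows and the fallback import are assumptions for illustration. On Python 3 the standard csv module already works on text, so the backport is only needed on Python 2:

```python
import io

try:
    from backports import csv  # Python 2 with backports.csv installed
except ImportError:
    import csv  # Python 3: the standard module already handles text


def write_rows(path, header, rows):
    """Write header and rows of unicode strings to path as UTF-8 CSV."""
    # newline="" lets the csv writer control line endings itself.
    with io.open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

With this, the question's script would call write_rows(u"./test.csv", [u"Name", u"Location"], list_of_rows) and leave the cells as unicode, with no .encode() calls. On Python 2 the header cells need the u prefix, because the backport expects text rather than byte strings.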

Regarding the Python "ASCII codec can't encode character" error when writing to CSV, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32939771/
