python - UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 0 when writing to CSV

Reposted by: 太空宇宙 · Updated: 2023-11-04 07:10:36

I have this code:

#!/usr/local/bin/python
# -*- coding: utf-8 -*-

import re
import urllib2
import BeautifulSoup
import csv

origin_site = 'http://typo3.nimes.fr/index.php?id=annuaire_assos&theme=0&rech=&num_page='

get_url = re.compile(r"""window.open\('(.*)','','toolbar=0,""", re.DOTALL).findall

pages = range(1,2)

for page_no in pages:
    req = ('%s%s' % (origin_site, page_no))
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    headers = { 'User-Agent' : user_agent }
    try:
        urllib2.urlopen(req)
    except urllib2.URLError, e:
        pass 
    else:
        # do something with the page
        doc = urllib2.urlopen(req)
        soup = BeautifulSoup.BeautifulSoup(doc)
        infoblock = soup.findAll('tr', { "class" : "menu2" })
        for item in infoblock:
            assoc_data = []
            soup = BeautifulSoup.BeautifulSoup(str(item))
            for tag in soup.recursiveChildGenerator():
                if isinstance(tag,BeautifulSoup.Tag) and tag.name in ('td'):
                    if tag.string is not None:
                        assoc_name = (tag.string)
                if isinstance(tag,BeautifulSoup.Tag) and tag.name in ('u'):
                    if tag.string is not None:
                        assoc_theme = (tag.string)

            get_onclick = str(soup('a')[0]['onclick']) # get the 'onclick' attribute
            url = get_url(get_onclick)[0]

            try:
                urllib2.urlopen(url)
            except urllib2.URLError, e:
                pass 
            else:
                assoc_page = urllib2.urlopen(url)
                #print assoc_page, url
                soup_page = BeautifulSoup.BeautifulSoup(assoc_page)
                assoc_desc = soup_page.find('table', { "bgcolor" : "#FFFFFF" })
                #print assoc_desc
                get_address = str(soup_page('td', { "class" : "menu2" }))
                soup_address = BeautifulSoup.BeautifulSoup(get_address)
                for tag in soup_address.recursiveChildGenerator():
                    if isinstance(tag,BeautifulSoup.Tag) and tag.name in ('a'):
                        if tag.string is not None:
                            assoc_email = (tag.string)
                assoc_data.append(assoc_theme)
                assoc_data.append(assoc_name)
                assoc_data.append(assoc_email)
                for tag in soup_address.recursiveChildGenerator():
                    if isinstance(tag,BeautifulSoup.Tag) and tag.name in ('td'):
                        if tag.string is not None:
                            if tag.string != '&nbsp;':
                                get_string = BeautifulSoup.BeautifulSoup(tag.string)
                                assoc_data.append(get_string)
                                #data.append(get_string)

            c = csv.writer(open("MYFILE.csv", "wb"))
            for item in assoc_data:
                c.writerow(item)

But I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 0: ordinal not in range(128)

How can I write French characters to the MYFILE.csv file? Also, can I improve the code further?

Best answer

Scroll to the bottom of the csv module documentation: http://docs.python.org/library/csv.html

Specifically, use this writer:

import codecs
import csv
import cStringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
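
The explicit s.encode("utf-8") in writerow is the step that avoids the error. Python 2's csv.writer works on byte strings, so a unicode value ends up being converted with the default ASCII codec, which cannot represent 'Ç' (u'\xc7'). Below is a minimal sketch of that failure and of the explicit-encoding fix. It is written in Python 3 syntax, which is an assumption (the code in this thread is Python 2), and the helper name try_encode is hypothetical:

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# Hypothetical sample string; it starts with U+00C7, the character in the traceback
name = "\xc7a va"

ascii_result = try_encode(name, "ascii")  # ASCII only covers code points 0..127
utf8_result = try_encode(name, "utf-8")   # UTF-8 can encode any Unicode text

print(ascii_result)
print(utf8_result)
```

The ASCII attempt fails at position 0, as in the reported error, while the UTF-8 attempt succeeds and yields two bytes for the first character.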

Then, instead of

c = csv.writer(open("MYFILE.csv", "wb"))

use

c = UnicodeWriter(open("MYFILE.csv", "wb"))
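
For readers on Python 3, the UnicodeWriter class should not be needed, because the csv module there works on text and the file object does the encoding. The following is a minimal sketch under that assumption, not the answer's own method. The write_rows helper, the sample rows, and the temporary-directory output path are all hypothetical and stand in for the scraped assoc_data:

```python
import csv
import os
import tempfile

def write_rows(path, rows, encoding="utf-8"):
    """Write each row (a list of strings) as one CSV line in the given encoding."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)

# Hypothetical sample data containing French characters
rows = [
    ["theme", "name", "email"],
    ["Culture", "\xc7a bouge \xe0 N\xeemes", "contact@example.org"],
]

# Written to the temp directory here; any writable path would do
out_path = os.path.join(tempfile.gettempdir(), "MYFILE.csv")
write_rows(out_path, rows)
```

Two details relate to the question's code: each row is passed as a list (a bare string given to writerow is split into one character per column), and the file is opened once rather than inside the loop, where opening it in write mode would truncate it on every iteration.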

A similar question about this UnicodeEncodeError when writing to CSV can be found on Stack Overflow: https://stackoverflow.com/questions/12886918/
