
python - Converting dict values from unicode to utf-8 (or ascii) in csv DictWriter

Reposted. Author: 行者123. Updated: 2023-11-30 23:11:08

I'm trying to write some data to a csv file, but unicode is killing my vibe.

My data is in dictionary format. Here is a snippet:

 {'category': u'Best food blog written by a linguist\xa0', 'runners_up': [], 'winner': [u'shesimmers.com'], 'category_url': 'http://www.chicagoreader.com/chicago/best-food-blog-written-by-a-linguist/BestOf?oid=4101663'}

Here is the piece of my code that uses the DictWriter approach.

    data = utf_8_encoder(data)
    with open('best_food_n_drink.csv', 'w') as csvfile:
        categories = ['category', 'category_url', 'winner', 'runners_up']
        writer = csv.DictWriter(csvfile, delimiter=',', fieldnames=categories)
        writer.writeheader()
        for row in data:
            writer.writerow(row)

utf_8_encoder is a function I defined earlier:

    def utf_8_encoder(unicode_csv_data):
        for line in unicode_csv_data:
            line.encode('utf-8')
        return unicode_csv_data

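I keep getting error messages like 'dict' object has no attribute 'encode'. I tried dropping the encoder function and calling row.values().encode('utf-8') in the for loop at the bottom instead, but that just tells me 'list' object has no attribute 'encode'.

I also tried replacing ('utf-8') with ('ascii', 'ignore'), but I just can't figure it out.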

Best answer

I'm not sure what output format you expect, but this will encode your strings:

    import csv

    def map_to(d):
        # iterate over the key/value pairs
        for k, v in d.items():
            # if v is a list, join and encode; otherwise just encode, as it is a string
            d[k] = ",".join(v).encode("utf-8") if isinstance(v, list) else v.encode("utf-8")

    map_to(data)

    with open('best_food_n_drink.csv', 'w') as csvfile:
        categories = ['category', 'category_url', 'winner', 'runners_up']
        writer = csv.DictWriter(csvfile, fieldnames=categories)
        writer.writeheader()
        writer.writerow(data)

This will output something like the following, though with a mix of strings and lists I'm not really sure what it should ultimately look like:

category,category_url,winner,runners_up
Best food blog written by a linguist ,http://www.chicagoreader.com/chicago/best-food-blog-written-by-a-linguist/BestOf?oid=4101663,shesimmers.com,

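Now that we know you actually have a list of dicts, we need to iterate over the list. The logic stays the same: we just run the function on each dict in the loop: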

    import csv

    data = [{'category': u"Best restaurant that's been around forever and is still worth the trip\xa0", 'runners_up': [u'Frontera Grill', u'Chicago Diner ', u'Sabatino\u2019s', u'Twin Anchors'], 'winner': [u'Lula Cafe'], 'category_url': 'http://www.chicagoreader.com/chicago/BestOf?category=1979894&year=2011'},
            {'category': u'Best bang for your buck\xa0', 'runners_up': [u'Frasca Pizzeria & Wine Bar', u'Chutney Joe\u2019s', u'"My boyfriend!"'], 'winner': [u'Big Star', u'Sultan\u2019s Market']}]

    def map_to(d):
        for k, v in d.items():
            d[k] = ",".join(v).encode("utf-8") if isinstance(v, list) else v.encode("utf-8")

    with open('best_food_n_drink.csv', 'w') as csvfile:
        categories = ['category', 'category_url', 'winner', 'runners_up']
        writer = csv.DictWriter(csvfile, fieldnames=categories)
        writer.writeheader()
        # get each dict from the list
        for d in data:
            # run the encode func
            map_to(d)
            writer.writerow(d)

I'm assuming 'category_url' actually exists in the second dict.
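The snippets above rely on Python 2 behaviour, where `.encode("utf-8")` yields a byte string that the csv module writes as-is. As a minimal sketch, assuming Python 3 instead (an assumption; the original post does not say which version is in use), the encoding can be left to `open()` and only the lists need flattening. The names `flatten_row` and `write_rows` are hypothetical helpers, and an in-memory buffer stands in for the real file so the example has no side effects.

```python
import csv
import io

FIELDNAMES = ['category', 'category_url', 'winner', 'runners_up']


def flatten_row(row, sep=","):
    """Return a copy of row with list values joined into a single string."""
    return {k: sep.join(v) if isinstance(v, list) else v for k, v in row.items()}


def write_rows(rows, fileobj):
    """Write a list of dicts as CSV text to an already opened text file object."""
    # restval='' (the default) fills in keys that a row lacks
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES, restval='')
    writer.writeheader()
    for row in rows:
        writer.writerow(flatten_row(row))


rows = [
    {'category': 'Best bang for your buck\xa0',
     'runners_up': ['Frasca Pizzeria & Wine Bar', 'Chutney Joe\u2019s'],
     'winner': ['Big Star', 'Sultan\u2019s Market']},
]

# In-memory stand-in for a file opened in text mode
buffer = io.StringIO(newline='')
write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk under the same assumption, pass `open('best_food_n_drink.csv', 'w', encoding='utf-8', newline='')` as `fileobj`. A row without `category_url` simply gets an empty cell, and the original dicts are left unmodified because `flatten_row` returns a copy.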

To catch None values and avoid encoding errors, add one line to the function:

    def map_to(d):
        for k, v in d.items():
            # skip None values
            if v is not None:
                d[k] = " ".join(v).encode("utf-8") if isinstance(v, list) else v.encode("utf-8")
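The `None` check above only covers a value that is itself `None`; a list that contains `None` would still make `join` raise a `TypeError`. A small sketch of one way to cover both cases, assuming Python 3 text strings and using a hypothetical helper named `to_cell` that is not part of the original answer:

```python
def to_cell(value, sep=","):
    """Turn one dict value into text suitable for a single CSV cell."""
    if value is None:
        return ""
    if isinstance(value, list):
        # drop None entries so str.join does not raise TypeError
        return sep.join(str(item) for item in value if item is not None)
    return str(value)


row = {
    'category': 'Best bang for your buck\xa0',
    'category_url': None,
    'winner': ['Big Star', None],
    'runners_up': [],
}

# build a cleaned copy instead of mutating the original dict
clean = {key: to_cell(value) for key, value in row.items()}
```

Building a new dict rather than editing `row` in place is a design choice made here so that the source data stays available for other uses, such as the JSON dump.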

Depending on what you plan to do with the data, storing it as json may be useful:

    import json

    with open('best_food_n_drink.json', 'w') as js:
        json.dump(data, js)

Then, to get the list of data back:

    import json

    with open('best_food_n_drink.json') as js:
        data = json.load(js)
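A round trip like the one described can be sketched as follows, assuming Python 3. The temporary directory and the `ensure_ascii=False` and `indent` arguments are choices made for this illustration, not part of the original answer; `ensure_ascii=False` keeps characters such as the curly apostrophe readable in the file instead of writing `\u2019` escapes.

```python
import json
import os
import tempfile

data = [
    {'category': 'Best bang for your buck\xa0',
     'winner': ['Big Star', 'Sultan\u2019s Market'],
     'runners_up': []},
]

# a temporary directory keeps the sketch from touching real files
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'best_food_n_drink.json')

    # write the list of dicts, lists included, without flattening anything
    with open(path, 'w', encoding='utf-8') as js:
        json.dump(data, js, ensure_ascii=False, indent=2)

    # read it back into the same structure
    with open(path, encoding='utf-8') as js:
        loaded = json.load(js)
```

Unlike the CSV route, the lists under `winner` and `runners_up` survive as lists, so no joining or splitting is needed.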

Regarding "python - Converting dict values from unicode to utf-8 (or ascii) in csv DictWriter", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30312970/
