python - Unicode Decode Error: 'ascii' codec can't encode character u'\u2019'

Reposted. Author: 行者123. Updated: 2023-11-30 23:28:43

I keep getting the following error and can't seem to get .encode('ascii', errors='ignore') to work.

eqs = soup.find_all('div', {'style': 'margin:7px 5px 0px;vertical-align:top;text-align:center;display:inline-block;line-height:normal;width:120px;'})

for equipment in eqs:
    if '#b0c3d9' in str(equipment):
        f2.write(equipment.getText() + ', Common\n')
    if '#5e98d9' in str(equipment):
        f2.write(equipment.getText() + ', Uncommon\n')
    if '#4b69ff' in str(equipment):
        f2.write(equipment.getText() + ', Rare\n')
    if '#8847ff' in str(equipment):
        f2.write(equipment.getText() + ', Mythical\n')
    if '#b28a33' in str(equipment):
        f2.write(equipment.getText() + ', Immortal\n')
    if '#d32ce6' in str(equipment):
        f2.write(equipment.getText() + ', Legendary\n')
    if '#eb4b4b' in str(equipment):
        f2.write(equipment.getText() + ', Ancient\n')
    if '#ade55c' in str(equipment):
        f2.write(equipment.getText() + ', Arcana\n')

I have tried:

f2.write(equipment.getText().encode('ascii', errors='ignore'))

f2.write(equipment.encode('ascii', errors='ignore').getText())

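along with a few other things I'm embarrassed to post, such as running the text through a file that BeautifulSoup reads later, which only raised a different error. Thanks again for any help.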

Full traceback:

Traceback (most recent call last):
  File "<pyshell#285>", line 1, in <module>
    import D2soup1
  File "D2soup1.py", line 86, in <module>
    test()
  File "D2soup1.py", line 30, in test
    f2.write(equipment.getText() + ', Immortal\n')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)

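I'm converting each tag to a string with str() to pick the box-shadow colour out of the HTML below. I know this is probably not best practice, but it's the only way I could think of to capture it. I'm still new to BeautifulSoup.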

<div style="margin:7px 5px 0px;vertical-align:top;text-align:center;display:inline-block;line-height:normal;width:120px;"><div style="margin-bottom: 5px;box-shadow:0px 0px 2px 4px #5e98d9;"><a href="/Pirate_Slayer%27s_Tricorn" title="Pirate Slayer's Tricorn"><img alt="Pirate Slayer's Tricorn" src="http://hydra-media.cursecdn.com/dota2.gamepedia.com/thumb/7/79/Pirate_Slayer%27s_Tricorn.png/120px-Pirate_Slayer%27s_Tricorn.png" width="120" height="80" srcset="http://hydra-media.cursecdn.com/dota2.gamepedia.com/thumb/7/79/Pirate_Slayer%27s_Tricorn.png/180px-Pirate_Slayer%27s_Tricorn.png 1.5x, http://hydra-media.cursecdn.com/dota2.gamepedia.com/thumb/7/79/Pirate_Slayer%27s_Tricorn.png/240px-Pirate_Slayer%27s_Tricorn.png 2x"></a></div>

Best answer

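You are calling str(equipment) without a codec, which encodes the Tag object to ASCII. The same implicit ASCII encoding happens on the f2.write() line shown in the traceback: getText() returns a unicode value, and Python 2 encodes it as ASCII when it is written to a plain file.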
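To make the failure concrete, here is a minimal sketch of what the ASCII codec does with u'\u2019'. The item name is a hypothetical stand-in for whatever getText() returned on the real page, and the variable names are illustrative only.

```python
# Hypothetical item name containing U+2019 (right single quotation mark).
text = u"Slayer\u2019s Tricorn"

# Encoding to ASCII fails, because U+2019 is outside the 0-127 range.
try:
    text.encode("ascii")
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the offending character

# The "ignore" error handler drops the character instead of raising.
stripped = text.encode("ascii", "ignore")

# UTF-8 can represent the character, using three bytes for U+2019.
utf8_bytes = text.encode("utf-8")

print(failed_at)
print(repr(stripped))
print(repr(utf8_bytes))
```

Dropping the character with "ignore" avoids the exception but loses the apostrophe, which is why encoding to UTF-8 is usually the better choice when the output file may contain non-ASCII text.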

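Don't use str(); fetch the text once, as a unicode value. And use a mapping and a loop in place of so many if statements.

In this case, all you need to test is the style attribute: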

types = {
    '#b0c3d9': 'Common',
    '#5e98d9': 'Uncommon',
    '#4b69ff': 'Rare',
    '#8847ff': 'Mythical',
    '#b28a33': 'Immortal',
    '#d32ce6': 'Legendary',
    '#eb4b4b': 'Ancient',
    '#ade55c': 'Arcana'
}

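for equipment in eqs:
    # The colour code sits in the style attribute of the nested <div>.
    style = equipment.div.attrs.get('style', '')
    # Encode once to UTF-8 bytes so the Python 2 file write never falls back to ASCII.
    textcontent = equipment.getText().encode('utf8')
    for key in types:
        if key in style:
            f2.write('{}, {}\n'.format(textcontent, types[key]))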
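An alternative to encoding each value by hand is to let the file object do it. The sketch below assumes the output file can be opened with io.open(..., encoding='utf-8'), which accepts unicode text on both Python 2 and Python 3. The rows list, the write_rows helper, and the items.txt name are hypothetical stand-ins for the text and style values taken from the parsed tags.

```python
import io
import os
import shutil
import tempfile

types = {
    '#b0c3d9': 'Common',
    '#5e98d9': 'Uncommon',
    '#4b69ff': 'Rare',
    '#8847ff': 'Mythical',
    '#b28a33': 'Immortal',
    '#d32ce6': 'Legendary',
    '#eb4b4b': 'Ancient',
    '#ade55c': 'Arcana',
}

# Hypothetical (text, style) pairs standing in for values read from the tags.
rows = [
    (u"Slayer\u2019s Tricorn", u"margin-bottom: 5px;box-shadow:0px 0px 2px 4px #5e98d9;"),
    (u"Plain Hat", u"margin-bottom: 5px;box-shadow:0px 0px 2px 4px #4b69ff;"),
]


def write_rows(path, rows):
    """Write one 'text, rarity' line per row; the file handles the UTF-8 encoding."""
    with io.open(path, "w", encoding="utf-8") as out:
        for text, style in rows:
            for colour, label in types.items():
                if colour in style:
                    out.write(u"{}, {}\n".format(text, label))
                    break  # each style carries a single colour code


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "items.txt")
write_rows(path, rows)

# Read the file back as text to confirm the apostrophe survived.
with io.open(path, encoding="utf-8") as handle:
    written = handle.read()

shutil.rmtree(tmp_dir)
```

With this approach the text stays unicode until the moment it is written, so no call site needs its own .encode().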

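However, those colour codes are most likely located in an attribute on a tag inside equipment; look at just the attribute values of that tag, or use a .find() call to narrow down the search.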
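Once the style attribute value is in hand, the colour can also be extracted directly instead of being tested by substring. This is a rough sketch using a regular expression in place of BeautifulSoup calls; it assumes the colour always appears as a six-digit hex code inside a box-shadow declaration, as in the sample HTML. The rarity_from_style helper and the shortened sample_types mapping are hypothetical.

```python
import re

# Matches the first six-digit hex colour inside a box-shadow declaration.
HEX_IN_BOX_SHADOW = re.compile(r"box-shadow:[^;]*?(#[0-9a-fA-F]{6})")

# Shortened mapping for illustration; the full one is the types dict above.
sample_types = {
    '#5e98d9': 'Uncommon',
    '#eb4b4b': 'Ancient',
}


def rarity_from_style(style, mapping):
    """Return the rarity label for a style string, or None if no known colour is found."""
    match = HEX_IN_BOX_SHADOW.search(style)
    if match is None:
        return None
    # Normalise case so '#EB4B4B' and '#eb4b4b' look up the same key.
    return mapping.get(match.group(1).lower())


inner_style = "margin-bottom: 5px;box-shadow:0px 0px 2px 4px #5e98d9;"
outer_style = "margin:7px 5px 0px;vertical-align:top;text-align:center;"

print(rarity_from_style(inner_style, sample_types))
print(rarity_from_style(outer_style, sample_types))
```

A single dictionary lookup replaces the loop over every colour, and styles without a box-shadow (such as the outer div's) are skipped cleanly.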

This post on python - Unicode Decode Error: 'ascii' codec can't encode character u'\u2019' is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/21488574/
