
Python - Sending unicode characters (prefixed with \u) in an HTTP POST request

Reposted. Author: 行者123. Updated: 2023-12-01 05:50:59

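I'm writing a program to fetch and edit articles on Wikipedia, and I'm having some trouble handling Unicode characters prefixed with \u. I have tried .encode("utf8"), but it doesn't seem to work here. How can I properly encode these \u-prefixed values so they can be posted to Wikipedia? See this edit for an example of my problem. Here is some code.

Getting the page: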

url = "http://en.wikipedia.org/w/api.php?action=query&format=json&titles="+urllib.quote(name)+"&prop=revisions&rvprop=content"
articleContent = ClientCookie.urlopen(url).read().split('"*":"')[1].split('"}')[0].replace("\\n", "\n").decode("utf-8")

Before posting the page:

data = dict([(key, value.encode('utf8')) for key, value in data.iteritems()])
data["text"] = data["text"].replace("\\", "")
editInfo = urllib2.Request("http://en.wikipedia.org/w/api.php", urllib.urlencode(data))

Best answer

You are downloading JSON data, but you are not decoding it. Use the json library for that:

import json

articleContent = ClientCookie.urlopen(url)
data = json.load(articleContent)

JSON-encoded data looks a lot like Python, and it uses \u escapes as well, but it is in fact a subset of JavaScript.
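To make the difference concrete, here is a minimal sketch comparing the string-splitting approach from the question with a real JSON parser. It is written for Python 3 (the code in this thread is Python 2), and the `raw` response body is a made-up, shortened stand-in for what the API returns, not actual Wikipedia output.

```python
import json

# Hypothetical response body, shaped like the API output but heavily shortened.
raw = '{"query": {"pages": {"1": {"revisions": [{"*": "Caf\\u00e9 \\u2013 line one\\nline two"}]}}}}'

# String splitting keeps the JSON escapes as literal backslash sequences.
split_text = raw.split('"*": "')[1].split('"}')[0]

# Stripping the backslashes afterwards leaves debris such as "u00e9" in the text.
stripped = split_text.replace("\\", "")

# A JSON parser turns \uXXXX and \n into the real characters.
decoded = json.loads(raw)["query"]["pages"]["1"]["revisions"][0]["*"]

print(repr(split_text))
print(repr(stripped))
print(repr(decoded))
```

In this sketch only `decoded` contains the actual characters; the other two strings still carry the escape text, which is the kind of debris that would end up in a posted edit.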

The data variable now holds a deep data structure. Judging by the string splitting, you wanted this piece:

articleContent = data['query']['pages'].values()[0]['revisions'][0]['*']
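The lookup above relies on Python 2, where `dict.values()` returns a list. On Python 3 it returns a view that cannot be indexed, so a small change is needed. A sketch under that assumption, using a made-up response and a made-up page id:

```python
import json

# Hypothetical, shortened response; the page id "12345" is invented for illustration.
response_text = '{"query": {"pages": {"12345": {"revisions": [{"*": "{{italic title}}\\n| issues = 100"}]}}}}'
data = json.loads(response_text)

# The page id is not known in advance, so take the first (here: only) page.
page = next(iter(data["query"]["pages"].values()))
articleContent = page["revisions"][0]["*"]

print(articleContent)
```

On Python 3 the result is a `str`, which plays the role that `unicode` plays in Python 2.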

Now articleContent is an actual unicode() instance; it is the revision text of the page you were looking for:

>>> print u'\n'.join(data['query']['pages'].values()[0]['revisions'][0]['*'].splitlines()[:20])
{{For|the game|100 Bullets (video game)}}
{{GOCEeffort}}
{{italic title}}
{{Supercbbox <!--Wikipedia:WikiProject Comics-->
| title =100 Bullets
| image =100Bullets vol1.jpg
| caption = Cover to ''100 Bullets'' vol. 1 "First Shot, Last Call". Cover art by Dave Johnson.
| schedule = Monthly
| format =
|complete=y
|Crime = y
| publisher = [[Vertigo (DC Comics)|Vertigo]]
| date = August [[1999 in comics|1999]] – April [[2009 in comics|2009]]
| issues = 100
| main_char_team = [[Agent Graves]] <br/> [[Mr. Shepherd]] <br/> The Minutemen <br/> [[List of characters in 100 Bullets#Dizzy Cordova (also known as "The Girl")|Dizzy Cordova]] <br/> [[List of characters in 100 Bullets#Loop Hughes (also known as "The Boy")|Loop Hughes]]
| writers = [[Brian Azzarello]]
| artists = [[Eduardo Risso]]<br>Dave Johnson
| pencillers =
| inkers =
| colorists = Grant Goleash<br>[[Patricia Mulvihill]]
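Once the text is a properly decoded string, sending it back no longer needs any backslash stripping: the form encoder can percent-encode it as UTF-8. The following is a rough Python 3 sketch of that step only. The `title` field and both values are hypothetical (only `text` appears in the question's code), and no request is actually sent.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; only "text" comes from the question's code.
data = {"title": "Example page", "text": "Caf\u00e9 \u2013 100 Bullets"}

# urlencode() percent-encodes str values as UTF-8 by default,
# so non-ASCII characters survive without manual escaping.
body = urlencode(data).encode("ascii")

# Decoding the body again shows the original characters are intact.
round_trip = parse_qs(body.decode("ascii"))

print(body)
print(round_trip["text"][0])
```

A body built this way could then be passed as the `data` argument of `urllib.request.Request`, in the same place where the question's code passes `urllib.urlencode(data)`.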

Regarding "Python - Sending unicode characters (prefixed with \u) in an HTTP POST request", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14308424/
