
python - JSON decoder is inconsistent when unicode data is present

Reposted. Author: 行者123. Updated: 2023-11-28 22:53:08

(This question is related to this one.)

Consider the following session:

Python 2.7.3 (default, Jan  2 2013, 16:53:07) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import simplejson as json
>>>
>>> my_json = '''[
... {
... "id" : "normal",
... "txt" : "This is a normal entry"
... },
... {
... "id" : "αβγδ",
... "txt" : "This is a unicode entry"
... }
... ]'''
>>>
>>> cache = json.loads(my_json, encoding='utf-8')
>>>
>>> cache
[{'txt': 'This is a normal entry', 'id': 'normal'}, {'txt': 'This is a unicode entry', 'id': u'\u03b1\u03b2\u03b3\u03b4'}]

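Why does the JSON decoder sometimes produce unicode and sometimes a plain str? Shouldn't it always produce unicode?

Best answer

This appears to be an optimization in simplejson. From the simplejson docs: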

If s is a str then decoded JSON strings that contain only ASCII characters may be parsed as str for performance and memory reasons. If your code expects only unicode the appropriate solution is decode s to unicode prior to calling decode.
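The fix the docs describe can be sketched as follows. This is a minimal illustration, assuming the input arrives as UTF-8 encoded bytes; the helper name `loads_as_text` is hypothetical and not part of simplejson, and the import falls back to the standard-library `json` module when simplejson is not installed.

```python
try:
    import simplejson as json
except ImportError:  # fall back to the standard library
    import json


def loads_as_text(raw, encoding="utf-8"):
    """Hypothetical helper: decode bytes to text first, then parse."""
    if isinstance(raw, bytes):  # on Python 2, bytes is an alias for str
        raw = raw.decode(encoding)
    return json.loads(raw)


# Same shape as the data in the question, written with escapes so the
# source file itself stays pure ASCII.
raw = u'[{"id": "normal"}, {"id": "\u03b1\u03b2\u03b3\u03b4"}]'.encode("utf-8")

cache = loads_as_text(raw)
text_type = type(u"")  # unicode on Python 2, str on Python 3
ids_are_text = all(isinstance(entry["id"], text_type) for entry in cache)
```

Because the decoder now receives text rather than a byte string, the ASCII-only shortcut quoted above should no longer apply, so every decoded string has the same type.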

Note: every ASCII character is encoded identically in UTF-8 and in ASCII, so ASCII is a subset of UTF-8.
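The subset relationship is easy to check directly. The snippet below is only an illustration; the variable names are made up for this example.

```python
ascii_text = u"This is a normal entry"
greek_text = u"\u03b1\u03b2\u03b3\u03b4"

# ASCII-only text yields the same bytes under either encoding.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Non-ASCII text is representable in UTF-8 (two bytes per Greek letter here)...
greek_utf8_len = len(greek_text.encode("utf-8"))

# ...but cannot be encoded as ASCII at all.
try:
    greek_text.encode("ascii")
    greek_fits_ascii = True
except UnicodeEncodeError:
    greek_fits_ascii = False
```

This is why simplejson can safely hand back a plain str for the ASCII-only values: interpreting those bytes as UTF-8 gives the same characters.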

Regarding "python - JSON decoder is inconsistent when unicode data is present", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19701806/
