
python - "Expected String or Unicode" when reading JSON with Pandas

Reposted. Author: 太空狗. Updated: 2023-10-29 22:11:14

I am trying to read the JSON string returned by the OpenStreetMap API, which is valid JSON.

I am using the following code:

import pandas as pd
import requests

# Bottom left
minLat = 50.9549
minLon = 13.55232

# Top right
maxLat = 51.1390
maxLon = 13.89873

osmrequest = {'data': '[out:json][timeout:25];(node["highway"="bus_stop"](%s,%s,%s,%s););out body;>;out skel qt;' % (minLat, minLon, maxLat, maxLon)}
osmurl = 'http://overpass-api.de/api/interpreter'
osm = requests.get(osmurl, params=osmrequest)

osmdata = osm.json()

osmdataframe = pd.read_json(osmdata)

which throws the following error:

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-66-304b7fbfb645> in <module>()
----> 1 osmdataframe = pd.read_json(osmdata)

/Users/paul/anaconda/lib/python2.7/site-packages/pandas/io/json.pyc in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit)
196 obj = FrameParser(json, orient, dtype, convert_axes, convert_dates,
197 keep_default_dates, numpy, precise_float,
--> 198 date_unit).parse()
199
200 if typ == 'series' or obj is None:

/Users/paul/anaconda/lib/python2.7/site-packages/pandas/io/json.pyc in parse(self)
264
265 else:
--> 266 self._parse_no_numpy()
267
268 if self.obj is None:

/Users/paul/anaconda/lib/python2.7/site-packages/pandas/io/json.pyc in _parse_no_numpy(self)
481 if orient == "columns":
482 self.obj = DataFrame(
--> 483 loads(json, precise_float=self.precise_float), dtype=None)
484 elif orient == "split":
485 decoded = dict((str(k), v)

TypeError: Expected String or Unicode

How can I modify the request or the Pandas read_json call to avoid the error? And by the way, what is the problem?

Best answer

If you write the JSON string to a file,

# A requests Response has no read() method; osm.content holds the raw body bytes
content = osm.content
with open('/tmp/out', 'wb') as f:
    f.write(content)

you will see something like this:

{
  "version": 0.6,
  "generator": "Overpass API",
  "osm3s": {
    "timestamp_osm_base": "2014-07-20T07:52:02Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [

{
  "type": "node",
  "id": 536694,
  "lat": 50.9849256,
  "lon": 13.6821776,
  "tags": {
    "highway": "bus_stop",
    "name": "Niederhäslich Bergmannsweg"
  }
},
...]}

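If the JSON string is converted to a Python object, the result is a dict whose elements key holds a list of dicts. The vast majority of the data is in this list of dicts. Note that osm.json() has already performed this conversion, so osmdata is a dict rather than a string, which is why pd.read_json raises "Expected String or Unicode".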
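To make that structure concrete, here is a minimal sketch using only the standard library. The name sample_text is a hypothetical, abbreviated stand-in for the response body, built from the excerpt above; it is not the full API output.

```python
import json

# Hypothetical, abbreviated stand-in for the response body (assumption:
# the real body has the same shape as the excerpt shown above).
sample_text = '''
{
  "version": 0.6,
  "generator": "Overpass API",
  "elements": [
    {"type": "node", "id": 536694, "lat": 50.9849256, "lon": 13.6821776,
     "tags": {"highway": "bus_stop", "name": "Niederhäslich Bergmannsweg"}}
  ]
}
'''

# Parsing the string gives the same kind of object that osm.json() returns.
parsed = json.loads(sample_text)

print(type(parsed).__name__)              # the top level is a dict
print(sorted(parsed))                     # its top-level keys
print(type(parsed['elements']).__name__)  # 'elements' is a list ...
print(parsed['elements'][0]['tags'])      # ... of dicts, each with a nested 'tags' dict
```

Passing an already parsed object like this to a string-based parser is the mismatch behind the TypeError in the question.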

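This JSON string cannot be converted directly into a Pandas object. What would be the index, and what would be the columns? You surely do not want [u'elements', u'version', u'osm3s', u'generator'] to be the columns, since almost all of the information is in the elements list of dicts.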

But if you want the DataFrame to contain only the data in the elements list, then you have to say so explicitly, since Pandas cannot make that assumption for you.
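A minimal sketch of what "saying so explicitly" could look like, run on made-up sample data rather than the live API. The dict sample is hypothetical; its second element uses placeholder values chosen purely for illustration.

```python
import pandas as pd

# Hypothetical sample in the shape of the parsed response (what osm.json()
# returns). The second element is a made-up placeholder, not real OSM data.
sample = {
    "version": 0.6,
    "generator": "Overpass API",
    "elements": [
        {"type": "node", "id": 536694, "lat": 50.9849256, "lon": 13.6821776,
         "tags": {"highway": "bus_stop", "name": "Niederhäslich Bergmannsweg"}},
        {"type": "node", "id": 2, "lat": 51.0, "lon": 13.7,
         "tags": {"highway": "bus_stop", "name": "Example Stop"}},
    ],
}

# Select the list of element dicts ourselves and hand only that to Pandas.
elements_frame = pd.DataFrame(sample["elements"])

print(elements_frame.columns.tolist())
print(elements_frame.shape)
```

Each row now corresponds to one entry of elements, and the top-level metadata (version, generator) is left out.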

Further complicating matters, each dict in elements is itself a nested dict. Consider the first dict in elements:

{
  "type": "node",
  "id": 536694,
  "lat": 50.9849256,
  "lon": 13.6821776,
  "tags": {
    "highway": "bus_stop",
    "name": "Niederhäslich Bergmannsweg"
  }
}

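Should ['lat', 'lon', 'type', 'id', 'tags'] be the columns? That seems plausible, except that the tags column would end up being a column of dicts, which is usually not very useful. It would perhaps be better if the keys in the tags dict were made into columns. We can do that, but again we have to write the code ourselves, since Pandas has no way of knowing that this is what we want.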


import pandas as pd
import requests

# Bottom left
minLat = 50.9549
minLon = 13.55232

# Top right
maxLat = 51.1390
maxLon = 13.89873

osmrequest = {'data': '[out:json][timeout:25];(node["highway"="bus_stop"](%s,%s,%s,%s););out body;>;out skel qt;' % (minLat, minLon, maxLat, maxLon)}
osmurl = 'http://overpass-api.de/api/interpreter'
osm = requests.get(osmurl, params=osmrequest)

# osm.json() already returns a parsed dict; keep only the list of elements
osmdata = osm.json()
osmdata = osmdata['elements']
for dct in osmdata:
    # Promote each tag to a top-level key, then drop the nested dict.
    # items() works on both Python 2 and 3, unlike iteritems().
    for key, val in dct['tags'].items():
        dct[key] = val
    del dct['tags']

osmdataframe = pd.DataFrame(osmdata)
print(osmdataframe[['lat', 'lon', 'name']].head())

yields

         lat        lon                        name
0  50.984926  13.682178  Niederhäslich Bergmannsweg
1  51.123623  13.782789                Sagarder Weg
2  51.065752  13.895734     Weißig, Einkaufszentrum
3  51.007140  13.698498          Stuttgarter Straße
4  51.010199  13.701411          Heilbronner Straße
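As an alternative to flattening by hand, newer Pandas releases (1.0 and later) provide pd.json_normalize, which expands nested dicts into dotted column names such as tags.name. This is not part of the original answer; the sketch below assumes such a Pandas version and uses a hypothetical elements list whose second entry is a made-up placeholder.

```python
import pandas as pd

# Hypothetical sample in the shape of osm.json()['elements'].
# The second entry is a made-up placeholder, not real OSM data.
elements = [
    {"type": "node", "id": 536694, "lat": 50.9849256, "lon": 13.6821776,
     "tags": {"highway": "bus_stop", "name": "Niederhäslich Bergmannsweg"}},
    {"type": "node", "id": 2, "lat": 51.0, "lon": 13.7,
     "tags": {"highway": "bus_stop", "name": "Example Stop"}},
]

# Nested keys become dotted columns: 'tags.highway', 'tags.name'.
flat = pd.json_normalize(elements)

# Strip the 'tags.' prefix so the columns match the hand-written loop.
flat = flat.rename(
    columns=lambda c: c[len("tags."):] if c.startswith("tags.") else c
)

print(flat[["lat", "lon", "name"]])
```

Unlike the loop, this leaves the input dicts unmodified. If a tag key collided with a top-level key such as id, stripping the prefix would produce duplicate column names, so in that case keeping the dotted names is the safer choice.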

Regarding python - "Expected String or Unicode" when reading JSON with Pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24848416/
