
python - Eliminating nesting by creating new objects from JSON

Reposted. Author: 太空狗. Updated: 2023-10-29 20:17:55

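I have a standard nested JSON file, shown below. It is nested several levels deep, and I need to eliminate all the nesting by creating new objects.

The nested JSON file: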

{
    "persons": [{
        "id": "f4d322fa8f552",
        "address": {
            "building": "710",
            "coord": "[123, 465]",
            "street": "Avenue Road",
            "zipcode": "12345"
        },
        "cuisine": "Chinese",
        "grades": [{
            "date": "2013-03-03T00:00:00.000Z",
            "grade": "B",
            "score": {
                "x": 3,
                "y": 2
            }
        }, {
            "date": "2012-11-23T00:00:00.000Z",
            "grade": "C",
            "score": {
                "x": 1,
                "y": 22
            }
        }],
        "name": "Shash"
    }]
}

New objects to be created:

persons
[
    {
        "id": "f4d322fa8f552",
        "cuisine": "Chinese",
        "name": "Shash"
    }
]

persons_address
[
    {
        "id": "f4d322fa8f552",
        "building": "710",
        "coord": "[123, 465]",
        "street": "Avenue Road",
        "zipcode": "12345"
    }
]

persons_grade
[
    {
        "id": "f4d322fa8f552",
        "__index": "0",
        "date": "2013-03-03T00:00:00.000Z",
        "grade": "B"
    },
    {
        "id": "f4d322fa8f552",
        "__index": "1",
        "date": "2012-11-23T00:00:00.000Z",
        "grade": "C"
    }
]

persons_grade_score
[
    {
        "id": "f4d322fa8f552",
        "__index": "0",
        "x": "3",
        "y": "2"
    },
    {
        "id": "f4d322fa8f552",
        "__index": "1",
        "x": "1",
        "y": "22"
    }
]

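My approach: I used a normalize function to turn all the lists into dicts, and added another function that adds the id to every nested dict.

Now I cannot work out how to traverse each level and create the new objects. Is there a way to do this?

The whole idea is that once the new objects are created, we can load them into a database.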

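Best answer

Concept

Here is a generic solution that does what you need. The idea is to loop recursively over every value of each object under the top-level "persons" key and to proceed according to the type of each value found.

For every value in a dictionary that is neither a dict nor a list, it puts that value into the corresponding top-level object you need.

If it finds a dict or a list instead, it recursively does the same thing again, finding more plain values, lists, or dicts.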
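The type check described above can be sketched in isolation. This is only an illustration, not part of the answer's class: the helper name `split_scalars` is hypothetical, and it handles a single level rather than recursing.

```python
def split_scalars(record):
    """Separate plain values from nested dicts/lists in one dictionary.

    Hypothetical helper for illustration only; the full solution
    recurses into the nested part instead of returning it.
    """
    scalars = {}
    nested = {}
    for key, value in record.items():
        if isinstance(value, (dict, list)):
            nested[key] = value      # needs further (recursive) parsing
        else:
            scalars[key] = value     # stays on the current object
    return scalars, nested


person = {
    "id": "f4d322fa8f552",
    "address": {"street": "Avenue Road"},
    "cuisine": "Chinese",
    "grades": [{"grade": "B"}],
}
scalars, nested = split_scalars(person)
```

Here `scalars` holds `id` and `cuisine`, while `nested` holds `address` and `grades`, which are the parts that would be parsed again.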

In addition, collections.defaultdict makes it easy to populate a dictionary with a list of unknown length for each key, which gives you the 4 top-level objects you want.
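As a small illustration of why `defaultdict(list)` is convenient here, the sketch below appends rows under keys that were never initialised. The row contents are made-up fragments of the question's data, and the variable name `tables` is an assumption.

```python
from collections import defaultdict

# Each missing key starts out as an empty list on first access,
# so rows can be appended without checking whether the key exists.
tables = defaultdict(list)

tables["persons_address"].append({"id": "f4d322fa8f552", "street": "Avenue Road"})
tables["persons_grades"].append({"id": "f4d322fa8f552", "__index": 0, "grade": "B"})
tables["persons_grades"].append({"id": "f4d322fa8f552", "__index": 1, "grade": "C"})

for name, rows in tables.items():
    print(name, len(rows))
```

With a plain `dict`, each `append` would first need a `setdefault(name, [])` call or an explicit membership check.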

Code example

from collections import defaultdict


class DictFlattener:
    """Split nested dictionary data into flat lists of objects."""

    def __init__(self, object_id_key, object_name):
        """Constructor.

        :param object_id_key: String key that identifies each base object
        :param object_name: String name given to the base object in data.

        """
        self._object_id_key = object_id_key
        self._object_name = object_name

        # Store each of the top-level results lists.
        self._collected_results = None

    def parse(self, data):
        """Parse the given nested dictionary data into separate lists.

        Each nested dictionary is transformed into its own list of objects,
        associated with the original object via the object id.

        :param data: Dictionary of data to parse.

        :returns: Single dictionary containing the resulting lists of
            objects, where each key is the object name combined with the
            list name via an underscore.

        """
        self._collected_results = defaultdict(list)

        for value_to_parse in data[self._object_name]:
            object_id = value_to_parse[self._object_id_key]
            parsed_object = {}

            for key, value in value_to_parse.items():
                sub_object_name = self._object_name + "_" + key
                parsed_value = self._parse_value(
                    value,
                    object_id,
                    sub_object_name,
                )
                # Compare against None so falsy values (0, "", False) are kept.
                if parsed_value is not None:
                    parsed_object[key] = parsed_value

            self._collected_results[self._object_name].append(parsed_object)

        return self._collected_results

    def _parse_value(self, value_to_parse, object_id, current_object_name,
                     index=None):
        """Parse some value of an unknown type.

        If it's a list or a dict, keep parsing, otherwise return it as-is.

        :param value_to_parse: Value to parse
        :param object_id: String id of the current top object being parsed.
        :param current_object_name: Name of the current level being parsed.
        :param index: Position within the enclosing list, if any.

        :returns: None if value_to_parse is a dict or a list, otherwise returns
            value_to_parse.

        """
        if isinstance(value_to_parse, dict):
            self._parse_dict(
                value_to_parse,
                object_id,
                current_object_name,
                index=index,
            )
            return None
        if isinstance(value_to_parse, list):
            self._parse_list(
                value_to_parse,
                object_id,
                current_object_name,
            )
            return None
        return value_to_parse

    def _parse_dict(self, dict_to_parse, object_id, current_object_name,
                    index=None):
        """Parse some value of a dict type and store it in self._collected_results.

        :param dict_to_parse: Dict to parse
        :param object_id: String id of the current top object being parsed.
        :param current_object_name: Name of the current level being parsed.
        :param index: Position within the enclosing list, if any.

        """
        parsed_dict = {
            self._object_id_key: object_id,
        }
        if index is not None:
            parsed_dict["__index"] = index

        for key, value in dict_to_parse.items():
            sub_object_name = current_object_name + "_" + key
            parsed_value = self._parse_value(
                value,
                object_id,
                sub_object_name,
                index=index,
            )
            # Compare against None so falsy values (0, "", False) are kept.
            if parsed_value is not None:
                parsed_dict[key] = parsed_value

        self._collected_results[current_object_name].append(parsed_dict)

    def _parse_list(self, list_to_parse, object_id, current_object_name):
        """Parse some value of a list type and store it in self._collected_results.

        Only dict (or list) items are stored; plain values inside a list
        are not collected.

        :param list_to_parse: List to parse
        :param object_id: String id of the current top object being parsed.
        :param current_object_name: Name of the current level being parsed.

        """
        for index, sub_dict in enumerate(list_to_parse):
            self._parse_value(
                sub_dict,
                object_id,
                current_object_name,
                index=index,
            )

Then use it:

# test_data is the nested dictionary loaded from the JSON file shown above.
parser = DictFlattener("id", "persons")
results = parser.parse(test_data)
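Since the stated goal is to load the new objects into a database, the following is a rough sketch of one way to do that with Python's built-in `sqlite3` module. It is an assumption-laden illustration, not part of the original answer: `load_tables` is a hypothetical helper, `sample_results` is a hand-written stand-in shaped like the parser output, and the tables are created with untyped columns.

```python
import sqlite3

# Stand-in for the parser output; in practice this would be `results`.
sample_results = {
    "persons": [
        {"id": "f4d322fa8f552", "cuisine": "Chinese", "name": "Shash"},
    ],
    "persons_grades": [
        {"id": "f4d322fa8f552", "__index": 0, "grade": "B"},
        {"id": "f4d322fa8f552", "__index": 1, "grade": "C"},
    ],
}


def load_tables(connection, tables):
    """Create one table per key and insert its rows (hypothetical helper).

    Table and column names are interpolated into the SQL text, so this
    is only suitable for trusted input such as your own JSON keys.
    """
    for table_name, rows in tables.items():
        if not rows:
            continue
        # Union of all keys, so rows with missing keys get NULLs.
        columns = sorted({key for row in rows for key in row})
        column_sql = ", ".join(f'"{column}"' for column in columns)
        placeholders = ", ".join("?" for _ in columns)
        connection.execute(f'CREATE TABLE "{table_name}" ({column_sql})')
        connection.executemany(
            f'INSERT INTO "{table_name}" ({column_sql}) VALUES ({placeholders})',
            [tuple(row.get(column) for column in columns) for row in rows],
        )
    connection.commit()


connection = sqlite3.connect(":memory:")
load_tables(connection, sample_results)
grades = connection.execute(
    'SELECT "__index", grade FROM persons_grades ORDER BY "__index"'
).fetchall()
```

A real schema would normally declare column types and keys explicitly; the untyped tables here only keep the sketch short.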

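Notes

  1. Your example data and your expected output are inconsistent in places, for example the scores are integers in the input but strings in the expected output. Likewise, the code above names each list after the JSON keys, so it produces "persons_grades" rather than "persons_grade". You will need to reconcile these when comparing the given with the expected.
  2. There is always more refactoring that could be done, or it could be written in a more functional style rather than as a class. But hopefully seeing this helps you understand how to approach it.
  3. As @jbernardo said, if you are inserting these into a relational database, they should not all just use "id" as the key; it should be "person_id".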
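Notes 2 and 3 can be combined into a single sketch: a function-based variant that writes the parent's id under a separate key in the child rows. This is one possible shape under stated assumptions, not the answer's code: the names `flatten`, `walk`, and the `parent_key` parameter are hypothetical, and plain values inside lists are skipped.

```python
from collections import defaultdict


def flatten(data, root_name, id_key="id", parent_key="person_id"):
    """Split nested data into flat row lists keyed by a derived name."""
    tables = defaultdict(list)

    def walk(value, name, object_id, index=None):
        if isinstance(value, list):
            for position, item in enumerate(value):
                # Assumption: scalars inside lists are ignored.
                if isinstance(item, (dict, list)):
                    walk(item, name, object_id, position)
            return
        row = {parent_key: object_id}
        if index is not None:
            row["__index"] = index
        for key, child in value.items():
            if isinstance(child, (dict, list)):
                walk(child, name + "_" + key, object_id, index)
            else:
                row[key] = child
        tables[name].append(row)

    for record in data[root_name]:
        object_id = record[id_key]
        row = {}
        for key, child in record.items():
            if isinstance(child, (dict, list)):
                walk(child, root_name + "_" + key, object_id)
            else:
                row[key] = child
        tables[root_name].append(row)
    return dict(tables)


data = {
    "persons": [{
        "id": "f4d322fa8f552",
        "address": {"street": "Avenue Road"},
        "grades": [
            {"grade": "B", "score": {"x": 3, "y": 2}},
            {"grade": "C", "score": {"x": 1, "y": 22}},
        ],
        "name": "Shash",
    }]
}
tables = flatten(data, "persons")
```

The top-level rows keep their own `id`, while every child row refers back to it through `person_id`, which maps more naturally onto a foreign key column.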

Regarding "python - Eliminating nesting by creating new objects from JSON", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51368630/
