
python - Merging multiple dictionaries that contain dictionaries inside lists

Reposted. Author: 行者123. Updated: 2023-12-02 18:06:29

I have several dictionaries (perhaps 10) with the following structure:

{'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 135},
{'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
{'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 1},
{'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1},
{'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
'stderr': ''}

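I want to combine all of these dictionaries by summing the integer 'count' values of entries that share the same 'foo', 'bar', and 'host' values (none of them is NoneType).

For example, given two dictionaries: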

dictA = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 135},
{'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
{'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 2},
{'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1},
{'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
'stderr': ''}

dictB = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 280},
{'foo': 'A', 'bar': 'B', 'host': 'orange', 'count': 46},
{'foo': 'A', 'bar': 'E', 'host': 'pineapple', 'count': 3},
{'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 2},
{'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
'stderr': ''}

the merged version should be:

dictMerged = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 415},
{'foo': 'A', 'bar': 'B', 'host': 'orange', 'count': 46},
{'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
{'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 4},
{'foo': 'A', 'bar': 'E', 'host': 'pineapple', 'count': 3},
{'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 2},
{'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1}],
'stderr': ''}

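Note that the order of the dictionary elements in the list changes once the 'count' values have been summed.

As a first step I tried combining them by identical 'host', as shown below, but the result is not what I want: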

hostname1 = {i["host"]: i for i in dictA['stdout']}
hostname2 = {i["host"]: i for i in dictB['stdout']}
all_host = hostname1|hostname2
{key: value + b[key] for key, value in a.items()}
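
To see why this attempt falls short, here is a small sketch (the names `left`, `right`, and `merged` and the counts are illustrative, not from the question). The dict union operator `|`, available from Python 3.9, keeps the right-hand value when a key appears on both sides, so counts are overwritten rather than added. Keying on 'host' alone also ignores 'foo' and 'bar'.

```python
# Two tiny host-keyed mappings, built the same way as hostname1 and hostname2.
left = {"apple": {"foo": "D", "bar": "E", "host": "apple", "count": 2}}
right = {"apple": {"foo": "D", "bar": "E", "host": "apple", "count": 5}}

# Dict union (Python 3.9+): for duplicate keys the right operand wins.
merged = left | right

print(merged["apple"]["count"])  # 5, not the desired sum 7
```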

Best answer

One approach

from collections import defaultdict
from operator import itemgetter

# create a dictionary (defaultdict) that puts the dictionaries with matching foo, bar, host in the same list
groups = defaultdict(list, {(d['foo'], d['bar'], d['host']): [d] for d in dictB['stdout']})
for d in dictA["stdout"]:
    key = (d['foo'], d['bar'], d['host'])
    groups[key].append(d)

# use item getter for better readability
count = itemgetter("count")

# create new list of dictionaries, sum the count values
ds = [{'foo': f, 'bar': b, 'host': h, 'count': sum(count(d) for d in v)} for (f, b, h), v in groups.items()]

# sort the list of dictionaries in decreasing order
res = {"stdout": sorted(ds, key=count, reverse=True), "stderr": ""}
print(res)

Output

{'stderr': '',
'stdout': [{'bar': 'B', 'count': 415, 'foo': 'A', 'host': None},
{'bar': 'B', 'count': 46, 'foo': 'A', 'host': 'orange'},
{'bar': 'B', 'count': 28, 'foo': 'C', 'host': 'egg'},
{'bar': 'E', 'count': 4, 'foo': 'D', 'host': 'apple'},
{'bar': 'E', 'count': 3, 'foo': 'A', 'host': 'pineapple'},
{'bar': 'F', 'count': 2, 'foo': 'C', 'host': 'carrot'},
{'bar': 'E', 'count': 1, 'foo': 'A', 'host': 'chicken breast'}]}

For more information on each function and data structure used in the code above, see: sorted, defaultdict, itemgetter
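
One thing to watch in the code above: the dict comprehension that seeds `groups` keeps only the last entry if `dictB['stdout']` itself contains two entries with the same (foo, bar, host) key. The sketch below uses made-up sample data (`sample`, `seeded`, and `appended` are illustrative names) to show the effect, together with a variant that starts from an empty defaultdict and appends every entry. This is a suggested adjustment, not part of the original answer.

```python
from collections import defaultdict

# Hypothetical input in which the same (foo, bar, host) key appears twice.
sample = {"stdout": [{"foo": "A", "bar": "B", "host": None, "count": 1},
                     {"foo": "A", "bar": "B", "host": None, "count": 2}],
          "stderr": ""}

# Seeding via a dict comprehension: the second entry overwrites the first.
seeded = defaultdict(list, {(d["foo"], d["bar"], d["host"]): [d] for d in sample["stdout"]})

# Starting empty and appending keeps both entries.
appended = defaultdict(list)
for d in sample["stdout"]:
    appended[(d["foo"], d["bar"], d["host"])].append(d)

print(sum(d["count"] for d in seeded[("A", "B", None)]))    # 2
print(sum(d["count"] for d in appended[("A", "B", None)]))  # 3
```

If every input is known to have unique keys, as in the question's example data, both forms give the same result.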

An alternative

Using groupby:

import pprint
from operator import itemgetter
from itertools import groupby


def key(d):
    return d["foo"], d["bar"], d["host"] or ""


count = itemgetter("count")
lst = sorted(dictA["stdout"] + dictB["stdout"], key=key)
groups = groupby(lst, key=key)
ds = [{'foo': f, 'bar': b, 'host': h or None, 'count': sum(count(d) for d in vs)} for (f, b, h), vs in groups]
res = {"stdout": sorted(ds, key=count, reverse=True), "stderr": ""}
print(res)
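
The sort before `groupby` matters because `itertools.groupby` only merges consecutive items with equal keys. A minimal illustration with made-up data (the list `hosts` is an assumption for demonstration only):

```python
from itertools import groupby

hosts = ["apple", "carrot", "apple"]

# Without sorting, the two "apple" items land in separate groups.
unsorted_groups = [(k, len(list(g))) for k, g in groupby(hosts)]

# After sorting, equal items are adjacent and form a single group.
sorted_groups = [(k, len(list(g))) for k, g in groupby(sorted(hosts))]

print(unsorted_groups)  # [('apple', 1), ('carrot', 1), ('apple', 1)]
print(sorted_groups)    # [('apple', 2), ('carrot', 1)]
```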

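The second approach has two caveats:

  1. Grouping requires a sort first, so it runs in O(n log n), whereas the grouping in the first approach is O(n).
  2. To sort the list of dictionaries, None has to be replaced with the empty string "", because None cannot be compared with a str.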
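
The second caveat can be checked with a short sketch. The `keys` list below is made up for illustration and mirrors the (foo, bar, host) tuples returned by the `key` function above.

```python
keys = [("A", "B", None), ("A", "B", "orange")]

# Comparing None with a str raises TypeError in Python 3.
try:
    sorted(keys)
    failed = False
except TypeError:
    failed = True

# Substituting "" for None makes every key comparable.
ordered = sorted(keys, key=lambda k: (k[0], k[1], k[2] or ""))

print(failed)   # True
print(ordered)  # [('A', 'B', None), ('A', 'B', 'orange')]
```

One consequence of this substitution: a host that really is the empty string would be grouped with None and reported as None, since the answer converts "" back with `h or None`.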

Multiple dictionaries

If you have more than two dictionaries, you can change the first approach to:

from collections import defaultdict
from operator import itemgetter

# create a dictionary (defaultdict) that puts the dictionaries with matching foo, bar, host in the same list
groups = defaultdict(list, {(d['foo'], d['bar'], d['host']): [d] for d in dictB['stdout']})

# create a list with all the entries from the remaining dictionaries
data = []
lst = [dictA]  # change this line to contain all the dictionaries except B
for d in lst:
    data.extend(d["stdout"])

for d in data:
    key = (d['foo'], d['bar'], d['host'])
    groups[key].append(d)

# use item getter for better readability
count = itemgetter("count")

# create new list of dictionaries, sum the count values
ds = [{'foo': f, 'bar': b, 'host': h, 'count': sum(count(d) for d in v)} for (f, b, h), v in groups.items()]

# sort the list of dictionaries in decreasing order
res = {"stdout": sorted(ds, key=count, reverse=True), "stderr": ""}
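
As a possible variation, the same idea can be wrapped in a function that treats every input the same way, so no dictionary needs special handling. This is a minimal sketch, not the original answer's code: `merge_results` is a hypothetical name, the sample inputs `a`, `b`, and `c` are made up, and the function assumes that 'stderr' can be reset to an empty string as in the code above.

```python
from collections import Counter
from itertools import chain


def merge_results(*results):
    """Sum 'count' over entries sharing (foo, bar, host) across all inputs."""
    totals = Counter()
    for d in chain.from_iterable(r["stdout"] for r in results):
        totals[(d["foo"], d["bar"], d["host"])] += d["count"]
    merged = [{"foo": f, "bar": b, "host": h, "count": c}
              for (f, b, h), c in totals.items()]
    # Largest counts first; ties keep first-seen order because the sort is stable.
    merged.sort(key=lambda d: d["count"], reverse=True)
    return {"stdout": merged, "stderr": ""}


a = {"stdout": [{"foo": "A", "bar": "B", "host": None, "count": 135}], "stderr": ""}
b = {"stdout": [{"foo": "A", "bar": "B", "host": None, "count": 280},
                {"foo": "D", "bar": "E", "host": "apple", "count": 2}], "stderr": ""}
c = {"stdout": [{"foo": "D", "bar": "E", "host": "apple", "count": 2}], "stderr": ""}

print(merge_results(a, b, c))
```

Summing directly into a Counter avoids building intermediate lists. Tuples containing None work as keys because only hashing is needed, not ordering.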

What is itemgetter

From the documentation:

Return a callable object that fetches item from its operand using the operand's __getitem__() method. If multiple items are specified, returns a tuple of lookup values.

Equivalent to:

def itemgetter(*items):
    if len(items) == 1:
        item = items[0]
        def g(obj):
            return obj[item]
    else:
        def g(obj):
            return tuple(obj[item] for item in items)
    return g
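
A short usage sketch, applied to one entry from the question's data, shows both forms (the names `row` and `key_of` are illustrative):

```python
from operator import itemgetter

row = {"foo": "A", "bar": "B", "host": None, "count": 135}

# One item: the callable returns the value itself.
count = itemgetter("count")

# Several items: the callable returns a tuple of the looked-up values.
key_of = itemgetter("foo", "bar", "host")

print(count(row))   # 135
print(key_of(row))  # ('A', 'B', None)
```

The multi-item form could stand in for the hand-written `(d['foo'], d['bar'], d['host'])` tuples used in the first approach.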

Regarding "python - Merging multiple dictionaries that contain dictionaries inside lists", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73148327/
