
python - List of dictionaries: how to get the intersection based on one value and the symmetric difference based on another value

Reposted. Author: 太空狗. Last updated: 2023-10-30 02:27:23

Suppose I have:

dict_listA = [
    {'id': 0, 'b': 1},
    {'id': 1, 'b': 2},
    {'id': 2, 'b': 3},
]

dict_listB = [
    {'id': 1, 'b': 1},
    {'id': 2, 'b': 3},
    {'id': 3, 'b': 2},
]

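How can I get the list of entries whose 'id' appears in both lists (an intersection on 'id') but whose 'b' value differs between them (a symmetric difference on 'b')?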

same_a_different_b = [
    {'id': 1, 'b': 2},
]

This is my current solution:

same_a_different_b = []
for d1 in dict_listA:
    # Scan all of dict_listB for every entry of dict_listA
    same_a_different_b.extend(filter(lambda d2: d2['id'] == d1['id'] and d2['b'] != d1['b'], dict_listB))

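I ask because this is currently the most time-consuming part of my program, and I hope there is some way to do it faster. The result (same_a_different_b) is usually empty or very small; one list has about 900 entries and the other about 1400. It currently takes 9 seconds.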

Best answer

Try this:

hashed = {e['id']: e['b'] for e in dict_listB}
same_a_different_b2 = [e for e in dict_listA if e['id'] in hashed and hashed[e['id']] != e['b']]

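I think the complexity of this algorithm is O(len(a) + len(b)): one pass over dict_listB to build the dictionary, then one pass over dict_listA with average constant-time lookups. Your solution, by comparison, is O(len(a) * len(b)).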
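To make the two passes explicit, here is a minimal self-contained sketch of the same idea wrapped in a function. The function name `same_id_different_b` and its parameter names are hypothetical, chosen only for illustration, and the sketch assumes each id appears at most once in the second list.

```python
def same_id_different_b(list_a, list_b):
    """Return entries of list_a whose 'id' also occurs in list_b with a different 'b'."""
    # Pass 1, O(len(list_b)): index list_b by id (assumes ids are unique in list_b).
    b_by_id = {e['id']: e['b'] for e in list_b}
    # Pass 2, O(len(list_a)): each membership test and lookup is O(1) on average.
    return [e for e in list_a if e['id'] in b_by_id and b_by_id[e['id']] != e['b']]


dict_listA = [{'id': 0, 'b': 1}, {'id': 1, 'b': 2}, {'id': 2, 'b': 3}]
dict_listB = [{'id': 1, 'b': 1}, {'id': 2, 'b': 3}, {'id': 3, 'b': 2}]

result = same_id_different_b(dict_listA, dict_listB)
print(result)  # [{'id': 1, 'b': 2}]
```

Note that the returned entries come from the first list, which matches the expected output given in the question.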

If the lists can contain duplicates:

from collections import defaultdict

# Collect every 'b' value seen for each id in dict_listB
hashed = defaultdict(set)
for e in dict_listB:
    hashed[e['id']].add(e['b'])
same_a_different_b2 = [e for e in dict_listA if e['id'] in hashed and e['b'] not in hashed[e['id']]]
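The difference between the two variants only shows up when an id repeats in the second list. The following sketch uses made-up data (`list_a` and `list_b` are hypothetical) to show that a plain dict remembers only the last 'b' seen for each id, while the `defaultdict(set)` version checks against all of them.

```python
from collections import defaultdict

# Hypothetical data: id 1 appears twice in list_b with different 'b' values.
list_a = [{'id': 1, 'b': 2}]
list_b = [{'id': 1, 'b': 2}, {'id': 1, 'b': 5}]

# Plain dict: the later entry overwrites the earlier one, so only b == 5 is remembered.
last_b = {e['id']: e['b'] for e in list_b}
with_dict = [e for e in list_a if e['id'] in last_b and last_b[e['id']] != e['b']]

# defaultdict(set): every 'b' seen for an id is kept.
all_b = defaultdict(set)
for e in list_b:
    all_b[e['id']].add(e['b'])
with_sets = [e for e in list_a if e['id'] in all_b and e['b'] not in all_b[e['id']]]

print(with_dict)  # [{'id': 1, 'b': 2}]  (reported, although list_b has a matching entry)
print(with_sets)  # []
```

Which behavior is wanted depends on what "different b" should mean when an id repeats; the set version treats an entry as different only if none of the duplicates share its 'b'.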

Speed comparison (each list has 3 sample entries plus 2000 generated ones):

from collections import defaultdict

import time
from itertools import product

dict_listA = [
    {'id': 0, 'b': 1},
    {'id': 1, 'b': 2},
    {'id': 2, 'b': 3},
    *[{'id': i, 'b': 1} for i in range(10000, 10000 + 2000)]
]

dict_listB = [
    {'id': 1, 'b': 1},
    {'id': 2, 'b': 3},
    {'id': 3, 'b': 2},
    *[{'id': i, 'b': 1} for i in range(20000, 20000 + 2000)]
]

same_a_different_b = [
    {'id': 1, 'b': 2},
]


def previous_solution():
    # The approach from the question: scan all of dict_listB for every entry of dict_listA
    new_same_a_different_b = []
    for d1 in dict_listA:
        new_same_a_different_b.extend(filter(lambda d2: d2['id'] == d1['id'] and d2['b'] != d1['b'], dict_listB))
    return new_same_a_different_b


def new_solution():
    # Index dict_listB by id once, then do a single pass over dict_listA
    hashed = {e['id']: e['b'] for e in dict_listB}
    return [e for e in dict_listA if e['id'] in hashed and hashed[e['id']] != e['b']]


def other_solution():
    # Same pairwise comparison as previous_solution, written with itertools.product
    return [d1 for d1, d2 in product(dict_listA, dict_listB) if d2['id'] == d1['id'] and d2['b'] != d1['b']]


for func, name in [
    (previous_solution, 'previous_solution'),
    (new_solution, 'new_solution'),
    (other_solution, 'other_solution'),
]:
    # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
    start_time = time.perf_counter()
    new_result = func()
    print('{:20}: {:.5f}'.format(name, time.perf_counter() - start_time))
    # Compare ids only: previous_solution yields the matching dict_listB entries
    assert [e['id'] for e in new_result] == [e['id'] for e in same_a_different_b]

Results (in seconds):

previous_solution   : 1.06517
new_solution        : 0.00073
other_solution      : 0.60582
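Single-shot timings like the ones above can be noisy. As an alternative sketch (not part of the original answer), the standard-library `timeit` module can repeat each measurement. The function names `quadratic` and `hashed_lookup` and the data sizes are assumptions made for illustration, and the absolute numbers will vary by machine.

```python
import timeit

# Hypothetical data, smaller than above to keep the run short; ids 990..999 overlap.
list_a = [{'id': i, 'b': 1} for i in range(0, 1000)]
list_b = [{'id': i, 'b': 2} for i in range(990, 1990)]


def quadratic():
    # Compares every pair: O(len(list_a) * len(list_b)).
    return [d1 for d1 in list_a for d2 in list_b
            if d2['id'] == d1['id'] and d2['b'] != d1['b']]


def hashed_lookup():
    # Builds an index once, then scans list_a: O(len(list_a) + len(list_b)).
    hashed = {e['id']: e['b'] for e in list_b}
    return [e for e in list_a if e['id'] in hashed and hashed[e['id']] != e['b']]


# Both approaches should agree before their timings are compared.
assert quadratic() == hashed_lookup()

for func in (quadratic, hashed_lookup):
    # Take the best of 3 runs, each calling the function once.
    best = min(timeit.repeat(func, number=1, repeat=3))
    print('{:20}: {:.5f}'.format(func.__name__, best))
```

Taking the minimum of several runs is a common way to reduce the influence of background load on the reported time.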

Regarding "python - List of dictionaries: how to get the intersection based on one value and the symmetric difference based on another value", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41273234/
