gpt4 book ai didi

python - 将列表列表替换为 "condensed"列表列表,同时保持顺序

转载 作者:IT老高 更新时间:2023-10-28 20:32:51 25 4
gpt4 key购买 nike

我有一个列表,如我附加的代码中所示。如果有任何共同值,我想链接每个子列表。然后我想用一个精简的列表列表替换列表列表。 示例:如果我有一个列表 [[1,2,3],[3,4]] 我想要 [1,2,3,4] 。如果我有 [[4,3],[1,2,3]] 我想要 [4,3,1,2]。如果我有 [[1,2,3],[a,b],[3,4],[b,c]] 我想要 [[1,2,3, 4],[a,b,c]][[a,b,c],[1,2,3,4]] 我不在乎哪个。

我快到了……

我的问题是当我遇到像 [[1,2,3],[10,5],[3,8,5]] 这样的案例时,我想要 [1, 2,3,10,5,8] 但使用我当前的代码,我得到 [1,2,3,8,10,5]

这是我的代码:

import itertools

a = [1,2,3]
b = [3,4]
i = [21,22]
c = [88,7,8]
e = [5,4]
d = [3, 50]
f = [8,9]
g= [9,10]
h = [20,21]

lst = [a,b,c,i,e,d,f,g,h,a,c,i]*1000
#I have a lot of list but not very many different lists

def any_overlap(a, b):
sb = set(b)
return any(itertools.imap(sb.__contains__, a))

def find_uniq(lst):
''' return the uniq parts of lst'''
seen = set()
seen_add = seen.add
return [ x for x in lst if x not in seen and not seen_add(x)]

def overlap_inlist(o_lst, lstoflst):
'''
Search for overlap, using "any_overlap", of a list( o_lst) in a list of lists (lstoflst).
If there is overlap add the uniq part of the found list to the search list, and keep
track of where that list was found
'''
used_lst =[ ]
n_lst =[ ]
for lst_num, each_lst in enumerate(lstoflst):
if any_overlap(o_lst,each_lst):
n_lst.extend(each_lst)
used_lst.append(lst_num)
n_lst= find_uniq(n_lst)
return n_lst, used_lst

def comb_list(lst):
'''
For each list in a list of list find all the overlaps using 'ovelap_inlist'.
Update the list each time to delete the found lists. Return the final combined list.
'''
for updated_lst in lst:
n_lst, used_lst = overlap_inlist(updated_lst,lst)
lst[:] = [x for i,x in enumerate(lst) if i not in used_lst]
lst.insert(0,n_lst)
return lst
comb_lst = comb_list(lst)
print comb_lst

这个脚本的输出是:

[[88, 7, 8, 9, 10], [1, 2, 3, 4, 50, 5], [21, 22, 20]]

我想要它,所以 key 按原始顺序排列,例如:

[[88, 7, 8, 9, 10], [1, 2, 3, 4, 5, 50,], [21, 22, 20]]

5和50切换在新的lst[2]

我对 python 有点陌生。我将不胜感激任何问题的解决方案或对我当前代码的评论。我不是计算机科学家,我想可能有某种算法可以快速做到这一点(也许来自集合论?)。如果有这样的算法,请指出我正确的方向。

我可能会让这种方式变得更复杂,然后......谢谢!!

最佳答案

这是一种蛮力方法(可能更容易理解):

from itertools import chain

def condense(*lists):
# remember original positions
positions = {}
for pos, item in enumerate(chain(*lists)):
if item not in positions:
positions[item] = pos

# condense disregarding order
sets = condense_sets(map(set, lists))

# restore order
result = [sorted(s, key=positions.get) for s in sets]
return result if len(result) != 1 else result[0]

def condense_sets(sets):
result = []
for candidate in sets:
for current in result:
if candidate & current: # found overlap
current |= candidate # combine (merge sets)

# new items from candidate may create an overlap
# between current set and the remaining result sets
result = condense_sets(result) # merge such sets
break
else: # no common elements found (or result is empty)
result.append(candidate)
return result

示例

>>> condense([1,2,3], [10,5], [3,8,5])
[1, 2, 3, 10, 5, 8]
>>> a = [1,2,3]
>>> b = [3,4]
>>> i = [21,22]
>>> c = [88,7,8]
>>> e = [5,4]
>>> d = [3, 50]
>>> f = [8,9]
>>> g= [9,10]
>>> h = [20,21]
>>> condense(*[a,b,c,i,e,d,f,g,h,a,c,i]*1000)
[[1, 2, 3, 4, 5, 50], [88, 7, 8, 9, 10], [21, 22, 20]]
>>> condense([1], [2, 3, 2])
[[1], [2, 3]]

性能对比

condense_*() 函数来自这个问题的答案。 lst_OP 来自问题的输入列表(不同大小),lst_BK - 来自 @Blckknght's answer 的测试列表(不同大小)。见 the source .

测量表明,基于“不相交集”和“无向图的连通分量”概念的解决方案在两种类型的输入上表现相似。

name                       time ratio comment
condense_hynekcer 5.79 msec 1.00 lst_OP
condense_hynekcer2 7.4 msec 1.28 lst_OP
condense_pillmuncher2 11.5 msec 1.99 lst_OP
condense_blckknght 16.8 msec 2.91 lst_OP
condense_jfs 26 msec 4.49 lst_OP
condense_BK 30.5 msec 5.28 lst_OP
condense_blckknght2 30.9 msec 5.34 lst_OP
condense_blckknght3 32.5 msec 5.62 lst_OP


name time ratio comment
condense_blckknght 964 usec 1.00 lst_BK
condense_blckknght2 1.41 msec 1.47 lst_BK
condense_blckknght3 1.46 msec 1.51 lst_BK
condense_hynekcer2 1.95 msec 2.03 lst_BK
condense_pillmuncher2 2.1 msec 2.18 lst_BK
condense_hynekcer 2.12 msec 2.20 lst_BK
condense_BK 3.39 msec 3.52 lst_BK
condense_jfs 544 msec 563.66 lst_BK


name time ratio comment
condense_hynekcer 6.86 msec 1.00 lst_rnd
condense_jfs 16.8 msec 2.44 lst_rnd
condense_blckknght 28.6 msec 4.16 lst_rnd
condense_blckknght2 56.1 msec 8.18 lst_rnd
condense_blckknght3 56.3 msec 8.21 lst_rnd
condense_BK 70.2 msec 10.23 lst_rnd
condense_pillmuncher2 324 msec 47.22 lst_rnd
condense_hynekcer2 334 msec 48.61 lst_rnd

重现结果clone gist并运行 measure-performance-condense-lists.py

关于python - 将列表列表替换为 "condensed"列表列表,同时保持顺序,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/13714755/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com