python - Combining the values of a large number of overlapping intervals used as dictionary keys

Reposted. Author: 太空宇宙. Updated: 2023-11-03 19:33:32

I have a dictionary with items like this:

all = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
    ...
    n: {...}
}

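Edit: The dictionary names are arbitrary; I assigned them based on the names of the files the initial data was read from, and I can use any value I like to name these dictionaries. I want to get the sum of these values for each overlapping region. An output showing what the overlaps should look like would be: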

{('a', 121, 122): 10, ('a', 123, 130): 30, ('a', 131, 135): 50,
 ('a', 136, 140): 40, ('a', 141, 145): 20, ...}

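Edit: Each dictionary contains only non-overlapping intervals, so a given dictionary will never hold both ('a',2,10) and ('a',3,12). The intervals do overlap between dictionaries, though, because their start and end positions differ (i.e. the keys are not the same from one dictionary to the next).

I don't have to use a dictionary. I build this structure myself in the first place, so if this is easier with lists, sets, etc., I can put the data into one of those instead, and I am also happy with a solution based on a different data structure.

Thanks for your help.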

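Best answer

Okay, now I think I get it: basically you have a bunch of overlapping intervals, which you can picture as bars of a given thickness at a given position. You could draw these bars beneath each other and see how thick they are in total at any given point.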
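To make that picture concrete, here is a minimal sketch (not part of the original answer) that computes the combined thickness at a single position. The helper name `thickness_at` and the variable name `data` (used instead of `all`, to avoid shadowing the builtin) are made up for illustration, and the sketch assumes both endpoints of an interval are inclusive.

```python
# Sample data from the question, renamed from `all` to `data` (assumption).
data = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
}


def thickness_at(dicts, name, pos):
    """Hypothetical helper: sum the values of every interval called `name`
    that covers `pos`, treating both endpoints as inclusive."""
    return sum(
        value
        for d in dicts.values()
        for (n, start, stop), value in d.items()
        if n == name and start <= pos <= stop
    )


# Position 132 is covered by (123,145), (121,135) and (130,140): 20 + 10 + 20
print(thickness_at(data, 'a', 132))  # prints 50
```

Calling this for every position would be wasteful, which is why the approaches below accumulate all intervals in one pass instead.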

I think abusing the fact that you have integer positions is the simplest/fastest way to do this:

all = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
}

from collections import defaultdict

summer = defaultdict(int)
mini, maxi = None, None
for d in all.values():
    for (name, start, stop), value in d.items():
        # I'm completely ignoring the `name` here, not sure if that's what you want;
        # otherwise just separate the data before doing this ...
        mini = start if mini is None else min(mini, start)
        maxi = stop if maxi is None else max(maxi, stop)
        for i in range(start, stop + 1):
            summer[i] += value

# now we have the values at each point, very redundant but very fast so far
print(summer)

# now we can find the intervals:
def get_intervals(points, start, stop):
    cstart = start
    for i in range(start, stop + 1):
        if points[cstart] != points[i]:  # did the value change?
            yield cstart, i - 1, points[cstart]
            cstart = i
    # emit the final run, even if it is a single point
    yield cstart, stop, points[cstart]


print(list(get_intervals(summer, mini, maxi)))

When run on only the 'a' items, it gives:

[(121, 122, 10), (123, 129, 30), (130, 135, 50), (136, 140, 40), (141, 145, 20), (146, 149, 0), (150, 154, 10), (155, 170, 50), (171, 175, 28)]
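The code above ignores `name`, and its comment suggests separating the data first. A minimal sketch of one way to do that is shown here, keeping a separate per-point sum for each name. The function name `per_name_intervals` and the variable `data` (standing in for `all`) are assumptions for illustration, not part of the original answer.

```python
from collections import defaultdict

# Sample data from the question, renamed from `all` to `data` (assumption).
data = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
}


def per_name_intervals(dicts):
    """Hypothetical helper: same per-point summation, but kept apart per name."""
    # name -> position -> summed value
    points = defaultdict(lambda: defaultdict(int))
    for d in dicts.values():
        for (name, start, stop), value in d.items():
            for i in range(start, stop + 1):  # endpoints are inclusive
                points[name][i] += value

    result = {}
    for name, summed in points.items():
        lo, hi = min(summed), max(summed)
        runs = []
        run_start = lo
        for i in range(lo + 1, hi + 2):
            # close the current run when the value changes,
            # or when we step past the last covered position
            if i > hi or summed.get(i, 0) != summed.get(run_start, 0):
                runs.append((run_start, i - 1, summed.get(run_start, 0)))
                run_start = i
        result[name] = runs
    return result


for name, runs in per_name_intervals(data).items():
    print(name, runs)
```

For the sample data, the 'a' entry matches the list shown above, and 'b' comes out as `[(230, 233, 16), (234, 345, 80)]`.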

Edit: It just occurred to me how to do this really simply:

from collections import defaultdict
from heapq import heappush, heappop

class Summer(object):
    def __init__(self):
        # it's a priority queue, kind of like a sorted list
        self.hq = []

    def additem(self, start, stop, value):
        # at `start` add it as a positive value
        heappush(self.hq, (start, value))
        # at `stop` subtract that value again
        heappush(self.hq, (stop, -value))

    def intervals(self):
        # note: this consumes the heap, so it can only be iterated once
        hq = self.hq
        start, val = heappop(hq)
        while hq:
            point, value = heappop(hq)
            # skip empty intervals when several events share the same point
            if point != start:
                yield start, point, val
            # just maintain the current value and where the interval started
            val += value
            start = point
        assert val == 0

summers = defaultdict(Summer)
for d in all.values():
    for (name, start, stop), value in d.items():
        summers[name].additem(start, stop, value)

for name, s in summers.items():
    print(name, list(s.intervals()))
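Note that the heap-based version treats `start` and `stop` as breakpoints, so neighbouring intervals share an endpoint (for example `(121, 123, 10)` followed by `(123, 130, 30)`). If you want inclusive integer endpoints, as in the per-point version, one possible variant is sketched below. It swaps the heap for a dictionary of deltas that is sorted once, and switches a value off at `stop + 1`. The names `sweep`, `by_name` and `data` are made up for illustration.

```python
from collections import defaultdict

# Sample data from the question, renamed from `all` to `data` (assumption).
data = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
}


def sweep(intervals):
    """Hypothetical helper: `intervals` is an iterable of (start, stop, value)
    with inclusive integer endpoints; returns merged (start, stop, total) runs."""
    delta = defaultdict(int)
    for start, stop, value in intervals:
        delta[start] += value     # value switches on at `start`
        delta[stop + 1] -= value  # and off just after the inclusive `stop`

    # positions where the running total actually changes
    points = sorted(p for p, change in delta.items() if change != 0)

    result = []
    current = 0
    prev = None
    for point in points:
        if prev is not None:
            result.append((prev, point - 1, current))
        current += delta[point]
        prev = point
    return result


# group the intervals by name, then sweep each group separately
by_name = defaultdict(list)
for d in data.values():
    for (name, start, stop), value in d.items():
        by_name[name].append((start, stop, value))

for name, items in by_name.items():
    print(name, sweep(items))
```

On the sample data this reproduces the 'a' list from the per-point approach, including the zero-valued gap `(146, 149, 0)`, without iterating over every position inside each interval.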

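Regarding "python - Combining the values of a large number of overlapping intervals used as dictionary keys", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/4620273/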
