
python - Grouping two NumPy arrays into a dictionary of lists

Repost. Author: 行者123. Updated: 2023-12-03 15:53:17

I have two large NumPy arrays, each of shape (519990,), that look like this:

Order = np.array([0, 0, 0, 5, 6, 10, 14, 14, 14, 23, 23, 39])
Letters = np.array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L'])
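As you can see, the first array is always sorted in non-decreasing order and its values are non-negative. I want to group the letters by their Order value, so that the result looks like this:
{0: ['A', 'B', 'C'], 5: ['D'], 6: ['E'], 10: ['F'], 14: ['G', 'H', 'I'], 23: ['J', 'K'], 39: ['L']}
The code I am using is: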
import numpy as np
import pandas as pd

df = pd.DataFrame()
df['order'] = Order
df['letters'] = Letters

# Group by 'order' and turn each group's remaining columns into lists
linearDict = df.groupby('order').apply(lambda dfg: dfg.drop('order', axis=1).to_dict(orient='list')).to_dict()

# Convert each group's list of letters into a NumPy array
endProduct = {}
for k, v in linearDict.items():
    endProduct[k] = np.array(v['letters'])

# Result:
# endProduct = {0: array(['A', 'B', 'C']), 5: array(['D']), 6: array(['E']), 10: array(['F']), 14: array(['G', 'H', 'I']), 23: array(['J', 'K']), 39: array(['L'])}
My problem is that this process is slow. It uses so many system resources that it crashes my Jupyter Notebook. Is there a faster way?

Best answer

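We can take advantage of the fact that Order is sorted: equal values are adjacent, so once we have the indices where each run of equal values starts and ends, we can simply slice Letters between them, like so: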

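import numpy as np

def numpy_slice(Order, Letters):
    Order = np.asarray(Order)
    Letters = np.asarray(Letters)
    # Boundaries: index 0, every position where the value changes, and len(Order)
    idx = np.flatnonzero(np.r_[True, Order[:-1] != Order[1:], True])
    # Slice Letters between consecutive boundaries; the key is the group's Order value
    return {Order[i]: Letters[i:j] for (i, j) in zip(idx[:-1], idx[1:])}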
Sample run:
In [66]: Order
Out[66]: array([16, 16, 16, 16, 23, 30, 33, 33, 39, 39, 39, 39, 39, 39, 39])

In [67]: Letters
Out[67]:
array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
       'N', 'O'], dtype='<U1')

In [68]: numpy_slice(Order, Letters)
Out[68]:
{16: array(['A', 'B', 'C', 'D'], dtype='<U1'),
 23: array(['E'], dtype='<U1'),
 30: array(['F'], dtype='<U1'),
 33: array(['G', 'H'], dtype='<U1'),
 39: array(['I', 'J', 'K', 'L', 'M', 'N', 'O'], dtype='<U1')}
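
To make the boundary trick easier to follow, here is a minimal step-by-step sketch on the sample data from the question. The intermediate names `changes`, `mask` and `groups` are illustrative and not part of the answer. Converting the keys to `int` and the slices to lists is done here only for readability, and the sketch assumes Order is sorted and non-empty.

```python
import numpy as np

# Sample data from the question (letters quoted so the arrays are valid).
Order = np.array([0, 0, 0, 5, 6, 10, 14, 14, 14, 23, 23, 39])
Letters = np.array(list("ABCDEFGHIJKL"))

# True wherever an element differs from its successor,
# i.e. a group ends at that position and a new one starts right after it.
changes = Order[:-1] != Order[1:]

# Pad with True on both sides so that the start of the first group (0)
# and the end of the last group (len(Order)) also count as boundaries.
mask = np.r_[True, changes, True]
idx = np.flatnonzero(mask)

# Each pair of consecutive boundaries (i, j) delimits exactly one group.
# Assumption for readability only: keys become Python ints, values become lists.
groups = {int(Order[i]): Letters[i:j].tolist() for i, j in zip(idx[:-1], idx[1:])}

print(idx.tolist())  # [0, 3, 4, 5, 6, 9, 11, 12]
print(groups)
```

In `numpy_slice` itself, each value `Letters[i:j]` is a basic slice and therefore a view into `Letters` rather than a copy, so only the boundary indices and the dictionary are newly allocated. That is plausibly why it stays cheap on arrays of the size mentioned in the question.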

Regarding python - Grouping two NumPy arrays into a dictionary of lists, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62539255/
