
python - Merge/zip two Series into an ndarray of ndarrays


I have two pandas Series of the same length, like this:

S1 = 
0 -0.483415
1 -0.514082
2 -0.515724
3 -0.519375
4 -0.505685
...

S2 =
1 -0.961871
2 -0.964762
3 -0.963798
4 -0.962112
5 -0.962028
...

I want to zip them into a numpy ndarray of ndarrays, so that it looks like this:

<class 'numpy.ndarray'>
[[-0.483415 -0.961871]
[-0.514082 -0.964762]
[-0.515724 -0.963798]
...
]

If I wanted a list of tuples, I could write:

v = list(zip(S1, S2))

which gives me:

<class 'list'>
[(-0.48341467662344273, -0.961871075696243),
(-0.5140815458448855, -0.9647615371349125),
...
]

How can I perform the same "zip" but get back an ndarray of ndarrays? I don't want a loop.

Best Answer

zip is not necessary here; for better performance, use numpy/pandas directly:

arr = np.hstack((S1.values[:, None], S2.values[:, None]))

Or:

arr = np.vstack((S1, S2)).T

Or:

arr = pd.concat([S1.reset_index(drop=True), S2.reset_index(drop=True)], axis=1).values

Or:

arr = np.c_[S1, S2]

print(arr)
[[-0.483415 -0.961871]
[-0.514082 -0.964762]
[-0.515724 -0.963798]
[-0.519375 -0.962112]
[-0.505685 -0.962028]]
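
For completeness, a minimal self-contained sketch of the setup (values taken from the question; note that the two Series deliberately have different indexes, which is why the concat variant needs reset_index):

import numpy as np
import pandas as pd

# Sample data from the question; the indexes intentionally differ.
S1 = pd.Series([-0.483415, -0.514082, -0.515724, -0.519375, -0.505685])
S2 = pd.Series([-0.961871, -0.964762, -0.963798, -0.962112, -0.962028],
               index=[1, 2, 3, 4, 5])

# Each variant stacks the values positionally, ignoring the index:
arr = np.hstack((S1.values[:, None], S2.values[:, None]))
print(type(arr))  # <class 'numpy.ndarray'>
print(arr.shape)  # (5, 2)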

Performance:

#50k values
S1 = pd.concat([S1] * 10000, ignore_index=True)
S2 = pd.concat([S2] * 10000, ignore_index=True)

In [107]: %timeit arr = np.hstack((S1.values[:, None], S2.values[:, None]))
133 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [108]: %timeit arr = np.vstack((S1, S2)).T
176 µs ± 12 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [109]: %timeit arr = pd.concat([S1.reset_index(drop=True), S2.reset_index(drop=True)], axis=1).values
1.49 ms ± 74.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [110]: %timeit arr = np.c_[S1, S2]
320 µs ± 10.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [111]: %timeit np.array(list(zip(S1, S2)))
33 ms ± 545 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
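
To sanity-check that the fast variants produce exactly the same array as the zip baseline, one can compare them (a quick check, not part of the original answer):

baseline = np.array(list(zip(S1, S2)))
assert np.array_equal(np.hstack((S1.values[:, None], S2.values[:, None])), baseline)
assert np.array_equal(np.vstack((S1, S2)).T, baseline)
assert np.array_equal(np.c_[S1, S2], baseline)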

Regarding python - merging/zipping two Series into an ndarray of ndarrays, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54608224/
