
python: check whether a numpy array contains any element of another array

Reposted. Author: 太空狗. Updated: 2023-10-29 18:15:59

What is the best way to check whether a numpy array contains any element of another array?

Example:

array1 = [10,5,4,13,10,1,1,22,7,3,15,9]
array2 = [3,4,9,10,13,15,16,18,19,20,21,22,23]

If array1 contains any value from array2, I want to get True, otherwise False.

Best answer

With Pandas, you can use isin:

import numpy as np
import pandas as pd

a1 = np.array([10,5,4,13,10,1,1,22,7,3,15,9])
a2 = np.array([3,4,9,10,13,15,16,18,19,20,21,22,23])

>>> pd.Series(a1).isin(a2).any()
True

And with the numpy function in1d (per @Norman's comment):

>>> np.any(np.in1d(a1, a2))
True

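For the small arrays in this example, the set-based solution is the clear winner. For larger, disjoint arrays (i.e. no overlap), the Pandas and Numpy solutions are faster. However, np.intersect1d appears to perform best on the larger arrays.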
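
Note that set(array1) & set(array2) produces the set of shared values rather than True or False. One way to get the boolean the question asks for is sketched below; the function name sets_overlap is hypothetical, and set.isdisjoint is used here as an alternative to building the full intersection.

```python
def sets_overlap(first, second):
    """Return True if the two iterables share at least one element."""
    # set.isdisjoint accepts any iterable and can stop at the first common
    # element, so no intersection set has to be materialized.
    return not set(first).isdisjoint(second)

array1 = [10, 5, 4, 13, 10, 1, 1, 22, 7, 3, 15, 9]
array2 = [3, 4, 9, 10, 13, 15, 16, 18, 19, 20, 21, 22, 23]

print(sets_overlap(array1, array2))      # True
print(sets_overlap(array1, [100, 200]))  # False
```

Writing bool(set(array1) & set(array2)) gives the same answer and matches the expression timed in this answer more closely.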

Small arrays (12-13 elements)

%timeit set(array1) & set(array2)
The slowest run took 4.22 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 1.69 µs per loop

%timeit any(i in a1 for i in a2)
The slowest run took 12.29 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 1.88 µs per loop

%timeit np.intersect1d(a1, a2)
The slowest run took 10.29 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 15.6 µs per loop

%timeit np.any(np.in1d(a1, a2))
10000 loops, best of 3: 27.1 µs per loop

%timeit pd.Series(a1).isin(a2).any()
10000 loops, best of 3: 135 µs per loop

Arrays with 100k elements (no overlap):

a3 = np.random.randint(0, 100000, 100000)
a4 = a3 + 100000

%timeit np.intersect1d(a3, a4)
100 loops, best of 3: 13.8 ms per loop

%timeit pd.Series(a3).isin(a4).any()
100 loops, best of 3: 18.3 ms per loop

%timeit np.any(np.in1d(a3, a4))
100 loops, best of 3: 18.4 ms per loop

%timeit set(a3) & set(a4)
10 loops, best of 3: 23.6 ms per loop

%timeit any(i in a3 for i in a4)
1 loops, best of 3: 34.5 s per loop
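
The %timeit lines above only work inside IPython. A rough sketch of how the large-array comparison could be rerun as a plain script with the standard timeit module follows. The names candidates and results are hypothetical, np.isin stands in for np.in1d, and the pure-Python any(...) loop is left out because it took about 34 s in the measurement above. Timings depend on the machine and library versions, so none are shown.

```python
import timeit

import numpy as np

# Same construction as above: a4 is shifted so the arrays cannot overlap.
a3 = np.random.randint(0, 100000, 100000)
a4 = a3 + 100000

# Each candidate returns a truth value for "do the arrays share an element?".
candidates = {
    "np.intersect1d": lambda: np.intersect1d(a3, a4).size > 0,
    "np.isin": lambda: bool(np.isin(a3, a4).any()),
    "set": lambda: bool(set(a3) & set(a4)),
}

results = {}
for name, func in candidates.items():
    runs = 5
    # Best of 3 repeats, each repeat averaging over `runs` calls.
    best = min(timeit.repeat(func, number=runs, repeat=3)) / runs
    results[name] = func()
    print(f"{name}: {best * 1e3:.2f} ms per call, overlap={results[name]}")
```

Every entry in results should be False here, since a3 only holds values below 100000 and a4 only holds values from 100000 upward.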

Regarding "python: check whether a numpy array contains any element of another array", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36190533/
