
pandas - Get the number of entries less than or equal to each value in a Series

Reposted. Author: 行者123. Updated: 2023-12-04 13:19:30

For each entry in a pandas.Series, I want the count of all elements that are less than or equal to that entry, for example:

import pandas as pd

if __name__ == '__main__':
    a = pd.Series(data=[4, 7, 3, 5, 2, 1, 1, 6])
    # For each value i, count the entries of a that are <= i
    le = pd.Series(data=[a[a <= i].count() for i in a])
    print(le)

Result:

0    5
1    8
2    4
3    6
4    3
5    2
6    2
7    7
dtype: int64

For large datasets, is there a built-in Series function or a better way to do this?

Best answer

A numpy solution is faster: convert the Series to a numpy array, compare the array with itself by broadcasting to a 2D array, and finally count the True values with sum:

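b = a.values
# In pandas 0.24+, to_numpy() can be used instead:
# b = a.to_numpy()
# Row i of the 2D comparison marks the elements <= b[i]; summing each row counts them
le = pd.Series((b <= b[:, None]).sum(axis=1), index=a.index)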

Details:

print(b <= b[:, None])
[[ True False  True False  True  True  True False]
 [ True  True  True  True  True  True  True  True]
 [False False  True False  True  True  True False]
 [ True False  True  True  True  True  True False]
 [False False False False  True  True  True False]
 [False False False False False  True  True False]
 [False False False False False  True  True False]
 [ True False  True  True  True  True  True  True]]
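
To make the broadcasting step concrete, here is a minimal sketch that prints the shapes involved and the size of the boolean intermediate. The variable names col, mask and counts are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

a = pd.Series([4, 7, 3, 5, 2, 1, 1, 6])
b = a.to_numpy()

col = b[:, None]   # shape (8, 1): each element becomes its own row
mask = b <= col    # (8,) broadcast against (8, 1) gives an (8, 8) boolean array

print(b.shape, col.shape, mask.shape)  # (8,) (8, 1) (8, 8)

# Row i of mask marks the elements that are <= b[i],
# so summing along axis=1 yields the count for each entry.
counts = mask.sum(axis=1)
le = pd.Series(counts, index=a.index)

# The intermediate holds N * N one-byte booleans.
print(mask.nbytes)  # 64
```

Because the intermediate grows quadratically with the length of the Series, this approach is probably best suited to inputs small enough for an N x N boolean array to fit in memory.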

Alternative solutions using Series.le, either in a list comprehension or with Series.apply:

le = pd.Series([a.le(i).sum() for i in a])

le = a.apply(lambda i: a.le(i).sum())

print(le)
0    5
1    8
2    4
3    6
4    3
5    2
6    2
7    7
dtype: int64
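
The question asks about large datasets, where an N x N comparison may not fit in memory. The following is a sketch of a different technique that is not part of the original answer: sort the values once, then use np.searchsorted with side="right" to find how many sorted values are less than or equal to each entry. The helper name count_le_sorted is hypothetical, and the sketch assumes the Series contains no missing values.

```python
import numpy as np
import pandas as pd


def count_le_sorted(s: pd.Series) -> pd.Series:
    """Hypothetical helper: for each entry, count the entries <= it."""
    b = s.to_numpy()
    sorted_b = np.sort(b)
    # side="right" returns the insertion point after any equal values,
    # which is exactly the number of elements <= each query value.
    counts = np.searchsorted(sorted_b, b, side="right")
    return pd.Series(counts, index=s.index)


a = pd.Series([4, 7, 3, 5, 2, 1, 1, 6])
le = count_le_sorted(a)
print(le.tolist())  # [5, 8, 4, 6, 3, 2, 2, 7]
```

This variant needs one sort plus one binary search per element, and only arrays of length N, so it should scale to much longer Series than the 2D comparison.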

Performance:

import numpy as np

np.random.seed(2019)
N = 10**6
s = pd.Series(np.random.randint(100, size=N))
# print(s)
# Note: the timed cells below reference a, not s

In [173]: %%timeit
     ...: b = a.values
     ...: le = pd.Series((b <= b[:, None]).sum(axis=1), index=a.index)
     ...:
78.6 µs ± 510 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [174]: %%timeit
     ...: le = pd.Series([a.le(i).sum() for i in a])
     ...:
3.22 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [175]: %%timeit
     ...: le = a.apply(lambda i: a.le(i).sum())
     ...:
3.35 ms ± 290 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [176]: %%timeit
     ...: a.apply(lambda x: a[a.le(x)].count())
     ...:
5.41 ms ± 457 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [177]: %%timeit
     ...: le = pd.Series(data=[a[a <= i].count() for i in a])
     ...:
4.91 ms ± 281 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
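
For readers who want to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The size N = 2000 is an assumption chosen so the N x N boolean mask stays small, and by_rank is an additional approach that is not in the original answer: Series.rank with method="max" gives tied values the highest rank in their group, which equals the number of entries less than or equal to that value when there are no missing values. The function names are hypothetical, and no timings are quoted because they depend on the machine.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(2019)
N = 2_000  # assumed size; the broadcast mask needs N * N bytes
s = pd.Series(np.random.randint(100, size=N))


def by_broadcast(s: pd.Series) -> pd.Series:
    # The 2D comparison from the answer above
    b = s.to_numpy()
    return pd.Series((b <= b[:, None]).sum(axis=1), index=s.index)


def by_rank(s: pd.Series) -> pd.Series:
    # method="max": every member of a tie gets the tie's highest rank,
    # i.e. the count of entries <= that value (assumes no NaN).
    return s.rank(method="max").astype("int64")


# Both approaches should agree before their timings are compared.
assert np.array_equal(by_broadcast(s).to_numpy(), by_rank(s).to_numpy())

for fn in (by_broadcast, by_rank):
    seconds = timeit.timeit(lambda: fn(s), number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e3:.3f} ms per call")
```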

Regarding "pandas - Get the number of entries less than or equal to each value in a Series", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55570596/
