
python - Find the number of identical rows between two columns

Reposted. Author: 行者123. Updated: 2023-12-01 02:23:01

I have a table like the following:

    check_churn  is_churn
0          True         1
1          True         1
2         False         1
3         False         1
4          True         1
5          True         1
6          True         1
7          True         1
8          True         1
9          True         1
10         True         1

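I want to determine how many rows are the same in both columns. For example, since True = 1, row 0 is counted, and so is row 1. The answer here would therefore be 9, because only 2 rows do not match.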

Best answer

I believe you need to compare the two columns and count the True values with sum, since True values are treated like 1s:

print ((df['check_churn'] == df['is_churn']))
0 True
1 True
2 False
3 False
4 True
5 True
6 True
7 True
8 True
9 True
10 True
dtype: bool


print ((df['check_churn'] == df['is_churn']).sum())
9
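For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the table from the question and counts the matches. The variable names `matches`, `n_match`, and `n_mismatch` are illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild the table from the question: 11 rows, is_churn is always 1.
df = pd.DataFrame({
    "check_churn": [True, True, False, False, True, True,
                    True, True, True, True, True],
    "is_churn": [1] * 11,
})

# Element-wise comparison: True compares equal to 1, False to 0.
matches = df["check_churn"] == df["is_churn"]

n_match = int(matches.sum())        # each True counts as 1
n_mismatch = int((~matches).sum())  # rows where the columns disagree
print(n_match, n_mismatch)  # 9 2
```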

Another solution is to filter the rows and take the first value of DataFrame.shape:

print (df.loc[df.check_churn == df.is_churn].shape[0])
9
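The filtering approach is mainly useful when you also want to look at the rows themselves and not only the count. A small sketch, again using the question's data; `mask`, `matching_rows`, and `mismatched_rows` are names assumed for illustration:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "check_churn": [True, True, False, False, True, True,
                    True, True, True, True, True],
    "is_churn": [1] * 11,
})

mask = df["check_churn"] == df["is_churn"]

matching_rows = df.loc[mask]      # rows where both columns agree
mismatched_rows = df.loc[~mask]   # rows where they differ

print(matching_rows.shape[0])          # 9
print(mismatched_rows.index.tolist())  # [2, 3]
```

If only the count is needed, summing the boolean mask avoids building the filtered DataFrame at all.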

Timings:

import numpy as np
import pandas as pd

np.random.seed(2017)
N = 10000
# Random boolean and 0/1 columns of length N
df = pd.DataFrame({'check_churn': np.random.choice([True, False], size=N),
                   'is_churn': np.random.choice([0, 1], size=N)})
print (df)

In [35]: %timeit (df['check_churn'] == df['is_churn']).sum()
1000 loops, best of 3: 414 µs per loop

In [36]: %timeit sum(df['check_churn'] & df['is_churn'])
1000 loops, best of 3: 793 µs per loop

In [37]: %timeit (df.loc[df.check_churn == df.is_churn].shape[0])
1000 loops, best of 3: 708 µs per loop
With a larger frame, N = 1000000:

In [39]: %timeit (df['check_churn'] == df['is_churn']).sum()
100 loops, best of 3: 18.2 ms per loop

In [40]: %timeit sum(df['check_churn'] & df['is_churn'])
10 loops, best of 3: 54.7 ms per loop

In [41]: %timeit (df.loc[df.check_churn == df.is_churn].shape[0])
10 loops, best of 3: 23.4 ms per loop

In [42]: %timeit (df['check_churn'] & df['is_churn']).sum()
10 loops, best of 3: 21.2 ms per loop
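Note that the `&` variants in the timings do not compute the same thing as `==`: `&` is True only where both columns are truthy, so rows that match as False/0 are not counted. On the question's data the two happen to agree because is_churn is always 1. A hedged sketch with a small made-up frame (the data and the explicit `astype(bool)` cast are assumptions for illustration) shows the difference:

```python
import pandas as pd

# Hypothetical data containing a False/0 match, unlike the question's table.
df = pd.DataFrame({
    "check_churn": [True, False, True, False],
    "is_churn": [1, 0, 0, 1],
})

# Equality counts rows 0 (True/1) and 1 (False/0).
n_equal = int((df["check_churn"] == df["is_churn"]).sum())

# Logical AND counts only row 0, where both values are truthy.
# The cast makes the integer column explicitly boolean before combining.
n_both_true = int((df["check_churn"] & df["is_churn"].astype(bool)).sum())

print(n_equal, n_both_true)  # 2 1
```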

Regarding python - Find the number of identical rows between two columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47758720/
