
python - Extracting co-occurrence data from a DataFrame

Reposted. Author: 行者123. Updated: 2023-12-04 00:49:23

I have something like this:

   fromJobtitle    toJobtitle      size
0  CEO             CEO             65
1  CEO             Vice President  23
2  CEO             Employee        56
3  Vice President  CEO             112
4  Employee        CEO             20

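I want to count co-occurrences so that the two directions of each pair are combined, showing only the total between the two titles.

Example output: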

0  CEO  Vice President  135
1  CEO  Employee        76
2  CEO  CEO             65

Best answer

import pandas as pd

df = pd.DataFrame({
    'fromJobtitle': ['CEO', 'CEO', 'CEO', 'Vice President', 'Employee'],
    'toJobtitle': ['CEO', 'Vice President', 'Employee', 'CEO', 'CEO'],
    'size': [65, 23, 56, 112, 20]
})

# Build an order-independent key: sorting the two titles makes
# (CEO, Vice President) and (Vice President, CEO) the same tuple.
df['combination'] = df.apply(lambda row: tuple(sorted([
    row['fromJobtitle'],
    row['toJobtitle']
])), axis=1)
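To see why sorting works as a pairing key, here is a minimal sketch. The helper name `pair_key` is hypothetical and is not part of the original answer; it only isolates the `tuple(sorted([...]))` step used above.

```python
def pair_key(a, b):
    """Return an order-independent key for a pair of job titles (hypothetical helper)."""
    return tuple(sorted([a, b]))

# Both directions of the same pair collapse to one key.
forward = pair_key('CEO', 'Vice President')
backward = pair_key('Vice President', 'CEO')
print(forward == backward)  # True

# A pair of identical titles is left unchanged.
print(pair_key('CEO', 'CEO'))  # ('CEO', 'CEO')
```

Because the key is a tuple, it is hashable and can be used directly as a groupby key.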

Then:

# Sum only the numeric 'size' column for each unordered pair.
df = df.groupby('combination')['size'].sum().reset_index()

Result:

   combination            size
0  (CEO, CEO)             65
1  (CEO, Employee)        76
2  (CEO, Vice President)  135

Finally:

# Split the tuple back into two columns, then drop the helper column.
df['from'] = df.apply(lambda row: row['combination'][0], axis=1)
df['to'] = df.apply(lambda row: row['combination'][1], axis=1)
df = df.drop('combination', axis=1)
df.head()

Result:

   size  from  to
0  65    CEO   CEO
1  76    CEO   Employee
2  135   CEO   Vice President
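As a possible alternative to the row-wise `apply`, the same idea can be sketched in one pass by sorting the two title columns with NumPy. This is a variant written for illustration, not the original answer's method: the names `pairs`, `canonical`, and `result`, and the final descending sort, are assumptions chosen to mirror the example output in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'fromJobtitle': ['CEO', 'CEO', 'CEO', 'Vice President', 'Employee'],
    'toJobtitle': ['CEO', 'Vice President', 'Employee', 'CEO', 'CEO'],
    'size': [65, 23, 56, 112, 20],
})

# Sort the two titles within each row so (A, B) and (B, A) become identical.
pairs = np.sort(df[['fromJobtitle', 'toJobtitle']].to_numpy(), axis=1)

# Rebuild a frame with the canonical pair and the original sizes.
canonical = pd.DataFrame({
    'from': pairs[:, 0],
    'to': pairs[:, 1],
    'size': df['size'].to_numpy(),
})

# Sum sizes per unordered pair, largest total first.
result = (
    canonical.groupby(['from', 'to'], as_index=False)['size']
    .sum()
    .sort_values('size', ascending=False, ignore_index=True)
)
print(result)
```

With the sample data, the totals come out as 135 (CEO and Vice President), 76 (CEO and Employee), and 65 (CEO and CEO), in that order.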

Regarding python - extracting co-occurrence data from a DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67822416/
