
python - How to compare cell values of two different DataFrames in Python?

Reposted. Author: 行者123. Updated: 2023-12-01 01:51:08

I have two DataFrames:

Person_df

    Name  Emplid  Country
0   DK    123     India
1   JS    456     India
2   RM    789     China
3   MS    111     China
4   SR    222     China

Target_df

    Country  Category   Target
0   India    Marketing  Reduce spend by $xy.
1   India    R&D        Increase spend by $dd.
2   India    Infra      Reduce spend by $kn.
3   China    Marketing  Increase spend by $eg.
4   China    R&D        Increase spend by $cb.
5   China    Infra      Reduce spend by $mn.

My goal is to create a third DataFrame based on each person's country, like this:

Individual_df

TargetID  Category   Target
DK12301   Marketing  Reduce spend by $xy.
DK12302   R&D        Increase spend by $dd.
DK12303   Infra      Reduce spend by $kn.
JS45601   Marketing  Reduce spend by $xy.
JS45602   R&D        Increase spend by $dd.
JS45603   Infra      Reduce spend by $kn.
RM78901   Marketing  Increase spend by $eg.
RM78902   R&D        Increase spend by $cb.
RM78903   Infra      Reduce spend by $mn.
MS11101   Marketing  Increase spend by $eg.
MS11102   R&D        Increase spend by $cb.
MS11103   Infra      Reduce spend by $mn.
SR22201   Marketing  Increase spend by $eg.
SR22202   R&D        Increase spend by $cb.
SR22203   Infra      Reduce spend by $mn.

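Basically, I have to pick a person from Person_df, match his/her country against the countries listed in Target_df, and then assign every matching target to that person (stored in Individual_df).

The problem is that I am new to Python and cannot really figure out how to do the country comparison.

I wrote the following code: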

   

for index, row in Person_df.iterrows():
    for index1, row1 in Goals_df.iterrows():
        If Person_df['country'] == Person_df['country'] : #I know this is incorrect
            data = []
            #populate data[] with selected values for one person.
            #append data[] to Individual_df

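I need help with the following points:

1) How to actually compare the country for each person here.

2) Even if I knew how to do the comparison, the code I wrote is inefficient because it performs a lot of unnecessary iterations. Any suggestions on how I could improve it?

Thanks!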

Best answer

Try this:

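import pandas as pd

# Left join: every person gets all targets of his/her country
Individual_df = pd.merge(Person_df, Target_df, on=['Country'], how='left')
# TargetID = Name + Emplid + two-digit running number per employee
Individual_df['TargetID'] = Individual_df['Name'] + Individual_df['Emplid'].astype(str) + (Individual_df.groupby('Emplid').cumcount() + 1).astype(str).str.zfill(2)
Individual_df = Individual_df[['TargetID', 'Category', 'Target']]
print(Individual_df)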

Output:

    TargetID   Category                  Target
0    DK12301  Marketing    Reduce spend by $xy.
1    DK12302        R&D  Increase spend by $dd.
2    DK12303      Infra    Reduce spend by $kn.
3    JS45601  Marketing    Reduce spend by $xy.
4    JS45602        R&D  Increase spend by $dd.
5    JS45603      Infra    Reduce spend by $kn.
6    RM78901  Marketing  Increase spend by $eg.
7    RM78902        R&D  Increase spend by $cb.
8    RM78903      Infra    Reduce spend by $mn.
9    MS11101  Marketing  Increase spend by $eg.
10   MS11102        R&D  Increase spend by $cb.
11   MS11103      Infra    Reduce spend by $mn.
12   SR22201  Marketing  Increase spend by $eg.
13   SR22202        R&D  Increase spend by $cb.
14   SR22203      Infra    Reduce spend by $mn.

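Explanation:

  1. Perform a left join of Person_df with Target_df on Country
  2. Build TargetID from the name, the employee ID, and the cumcount within each employee ID
  3. Select the required columns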
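
The three steps can be checked end to end on the sample data from the question. This is only a minimal sketch: the frames are rebuilt by hand from the tables above, and the intermediate names `merged` and `seq` are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the tables in the question.
Person_df = pd.DataFrame({
    "Name": ["DK", "JS", "RM", "MS", "SR"],
    "Emplid": [123, 456, 789, 111, 222],
    "Country": ["India", "India", "China", "China", "China"],
})
Target_df = pd.DataFrame({
    "Country": ["India"] * 3 + ["China"] * 3,
    "Category": ["Marketing", "R&D", "Infra"] * 2,
    "Target": [
        "Reduce spend by $xy.", "Increase spend by $dd.", "Reduce spend by $kn.",
        "Increase spend by $eg.", "Increase spend by $cb.", "Reduce spend by $mn.",
    ],
})

# Step 1: left join, giving one row per (person, target of that person's country).
merged = Person_df.merge(Target_df, on="Country", how="left")

# Step 2: number each person's targets 01, 02, ... and build the ID string.
seq = (merged.groupby("Emplid").cumcount() + 1).astype(str).str.zfill(2)
merged["TargetID"] = merged["Name"] + merged["Emplid"].astype(str) + seq

# Step 3: keep only the requested columns.
Individual_df = merged[["TargetID", "Category", "Target"]]
print(Individual_df)
```

Because the join matches on `Country` directly, no explicit row-by-row comparison is needed, which is what removes the nested loops from the question.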

As requested by the asker, fetching the rows with a for loop:

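# Countries that occur in Person_df (df1 in the original snippet)
unique_countries = Person_df['Country'].unique().tolist()

# Walk over Target_df (df2 in the original snippet) row by row
for index, row in Target_df.iterrows():
    if row['Country'] in unique_countries:
        print(row.values)
        # do operation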

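Explanation:

  1. Find the unique countries in Person_df

  2. Iterate over the rows of Target_df with a for loop

  3. Check whether the row's country is among the unique countries; if it is, perform the required operation.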
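
The same membership check can also be written without an explicit loop. The sketch below puts the loop next to `Series.isin`, which is swapped in here as an alternative and is not what the answer used. The frames are cut down for brevity, and the "Brazil" row is a hypothetical addition so that the filter has something to exclude.

```python
import pandas as pd

Person_df = pd.DataFrame({
    "Name": ["DK", "RM"],
    "Emplid": [123, 789],
    "Country": ["India", "China"],
})
Target_df = pd.DataFrame({
    "Country": ["India", "China", "Brazil"],  # "Brazil" is hypothetical
    "Category": ["Marketing", "R&D", "Infra"],
    "Target": ["Reduce spend by $xy.", "Increase spend by $cb.", "(hypothetical)"],
})

unique_countries = Person_df["Country"].unique().tolist()

# Loop version: collect the category of every row whose country is known.
matched_rows = []
for index, row in Target_df.iterrows():
    if row["Country"] in unique_countries:
        matched_rows.append(row["Category"])

# Vectorized version: a boolean mask selects the same rows in one step.
matched = Target_df[Target_df["Country"].isin(unique_countries)]
print(matched)
```

On larger frames the mask is usually the faster option, since `iterrows` builds a Series object for every row.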

Regarding "python - How to compare cell values of two different DataFrames in Python?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50716573/
