
python - Pandas - Convert two columns into a new column as a dictionary

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 13:39:55

I am trying to use Pandas to convert two columns into a single column that holds a dictionary representation of those two columns.

import pandas as pd

df = pd.DataFrame({'Metrics' : [[("P", "P"), ("Q","Q")], ("K", "K"), ("Z", "Z")],
                   'Stage_Name' : ["P", "K", "Z"],
                   'Block_Name' : ["A", "B", "A"]})

Essentially, I want to merge Metrics and Stage_Name into another column named merged. For example, the first row would be:

{'P': [('P', 'P'), ('Q', 'Q')]}

I know how to convert a single row to its dictionary representation, but I am not sure how to do this for all rows without a for loop:

something = df.iloc[[0]].set_index('Stage_Name')['Metrics'].to_dict()
print(something)
Output: {'P': [('P', 'P'), ('Q', 'Q')]}

Later I want to aggregate by Block_Name, so for the merged column the result for Block_Name A would be the two dictionaries combined:

{'P': [('P', 'P'), ('Q', 'Q')], 'Z' : [('Z', 'Z')] }

For Stage_Name and Metrics, I gather the values for each group, like this:

grouped = df.groupby(df['Block_Name'])
df_2 = grouped.aggregate(lambda x: tuple(x))


Can anyone point me in the right direction? Thanks!

Best answer

df['Merged'] = [{key: val} for key, val in zip(df.Stage_Name, df.Metrics)]

>>> df
  Block_Name           Metrics Stage_Name                                Merged
0          A  [(P, P), (Q, Q)]          P  {u'P': [(u'P', u'P'), (u'Q', u'Q')]}
1          B            (K, K)          K                  {u'K': (u'K', u'K')}
2          A            (Z, Z)          Z                  {u'Z': (u'Z', u'Z')}

Your code then produces the desired result:

grouped = df.groupby(df['Block_Name'])
df_2 = grouped.aggregate(lambda x: tuple(x))[['Metrics', 'Stage_Name']]


>>> df_2
                               Metrics Stage_Name
Block_Name
A           ([(P, P), (Q, Q)], (Z, Z))     (P, Z)
B                            ((K, K),)       (K,)
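
The question also asks for the per-row dictionaries to be combined into one dictionary per Block_Name, which the aggregation above does not do for the Merged column. One possible way to get there is sketched below. It is not part of the original answer; the names merged_by_block and merged_series are hypothetical, and it assumes each Stage_Name appears at most once within a block (a repeated stage would overwrite the earlier entry).

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Metrics': [[("P", "P"), ("Q", "Q")], ("K", "K"), ("Z", "Z")],
    'Stage_Name': ["P", "K", "Z"],
    'Block_Name': ["A", "B", "A"],
})

# Hypothetical name: one dict per block, mapping Stage_Name -> Metrics.
# Assumption: Stage_Name is unique within a block; a duplicate stage
# would silently replace the earlier value.
merged_by_block = {
    block: dict(zip(group['Stage_Name'], group['Metrics']))
    for block, group in df.groupby('Block_Name')
}

# Optional: wrap the result in a Series indexed by Block_Name.
merged_series = pd.Series(merged_by_block, name='Merged')

print(merged_by_block)
```

With the sample data, 'Z' maps to the tuple ('Z', 'Z') exactly as it is stored in the Metrics column, not to a one-element list.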

Timings:

%timeit df['Merged'] = [{key: val} for key, val in zip(df.Stage_Name, df.Metrics)]
10000 loops, best of 3: 162 µs per loop

%timeit df['merged'] = df.apply(lambda row: {row['Stage_Name']:row['Metrics']}, axis=1)
1000 loops, best of 3: 332 µs per loop
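
%timeit is an IPython magic, so the comparison above cannot be run as a plain script. A rough sketch of the same comparison using the standard-library timeit module follows. The function names with_zip and with_apply are hypothetical, the repeat count is arbitrary, and absolute numbers will vary with the machine and the pandas version.

```python
import timeit

import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Metrics': [[("P", "P"), ("Q", "Q")], ("K", "K"), ("Z", "Z")],
    'Stage_Name': ["P", "K", "Z"],
    'Block_Name': ["A", "B", "A"],
})


def with_zip():
    # List comprehension over the two columns, as in the answer.
    return [{key: val} for key, val in zip(df['Stage_Name'], df['Metrics'])]


def with_apply():
    # Row-wise apply, the slower alternative timed in the answer.
    return df.apply(lambda row: {row['Stage_Name']: row['Metrics']}, axis=1)


# Both approaches should build the same dictionaries.
same_result = with_zip() == list(with_apply())

# Arbitrary repeat count; timeit.timeit returns total seconds for all runs.
runs = 200
print('zip:  ', timeit.timeit(with_zip, number=runs) / runs, 's per call')
print('apply:', timeit.timeit(with_apply, number=runs) / runs, 's per call')
```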

Regarding "python - Pandas - Convert two columns into a new column as a dictionary", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33378731/
