
python - Pandas groupby with a custom function to return column values as arrays

Reposted. Author: 行者123. Updated: 2023-12-02 02:55:02

I must be doing something wrong, but even after a lot of experimenting I can't figure out what it is...

Data:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [3,3,3,2,2,2,1,1],
    'X': [10,11,12,20,21,30,31,32],
    'Y': [100,110,120,200,210,300,310,320]
})

# Outputs:
   ID   X    Y
0   3  10  100
1   3  11  110
2   3  12  120
3   2  20  200
4   2  21  210
5   2  30  300
6   1  31  310
7   1  32  320

Here is my aggregation function. (Joining the values into a comma-separated string works fine.)

def _colum_to_array(data):
    # data['Xs'] = ",".join(str(d) for d in data['X']) # works
    # data['Ys'] = ",".join(str(d) for d in data['Y']) # works

    # The next two lines cause this: Length of values does not match length of index
    # which kind of makes sense.
    # data['Xs'] = [data['X'].values]
    # data['Ys'] = [data['Y'].values]

    # but why is this not working?
    # np.tile generates one copy of the array per row
    data['Xs'] = np.tile([data['X'].values], (data.shape[0], 1))
    data['Ys'] = np.tile([data['Y'].values], (data.shape[0], 1))

    return data

This is how I group:

df = df.groupby(['ID']).apply(_colum_to_array)

## Output is:
   ID   X    Y  Xs  Ys
0   3  10  100  10  10
1   3  11  110  10  10
2   3  12  120  10  10
3   2  20  200  20  20
4   2  21  210  20  20
5   2  30  300  20  20
6   1  31  310  31  31
7   1  32  320  31  31

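What I expect, or am trying to get, is something like this, with the values of the X and Y columns of each group captured as arrays: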

   ID   X    Y          Xs             Ys
0   3  10  100  [10,11,12]  [100,110,120]
1   3  11  110  [10,11,12]  [100,110,120]
2   3  12  120  [10,11,12]  [100,110,120]
3   2  20  200  [20,21,30]  [200,210,300]
4   2  21  210  [20,21,30]  [200,210,300]
5   2  30  300  [20,21,30]  [200,210,300]
6   1  31  310     [31,32]      [310,320]
7   1  32  320     [31,32]      [310,320]

Best answer

Use groupby.agg together with merge, for example:

df_new = df.merge(df.groupby("ID", as_index=False).agg(list)\
                    .rename(columns={'X':'Xs','Y':'Ys'}))
# or, with pandas 1.0.1, you can use named aggregation
df_new = df.merge(df.groupby("ID").agg(Xs=('X',list), Ys=('Y',list)).reset_index())

print(df_new)
   ID   X    Y            Xs               Ys
0   3  10  100  [10, 11, 12]  [100, 110, 120]
1   3  11  110  [10, 11, 12]  [100, 110, 120]
2   3  12  120  [10, 11, 12]  [100, 110, 120]
3   2  20  200  [20, 21, 30]  [200, 210, 300]
4   2  21  210  [20, 21, 30]  [200, 210, 300]
5   2  30  300  [20, 21, 30]  [200, 210, 300]
6   1  31  310      [31, 32]       [310, 320]
7   1  32  320      [31, 32]       [310, 320]
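
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea: aggregate once per ID, then attach the per-group lists back to every row. The names `per_group`, `merged` and `mapped` are made up for illustration. The explicit `on='ID', how='left'` arguments and the `Series.map` variant are my own additions, not part of the answer above, so treat them as one possible way to write it.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [3, 3, 3, 2, 2, 2, 1, 1],
    'X': [10, 11, 12, 20, 21, 30, 31, 32],
    'Y': [100, 110, 120, 200, 210, 300, 310, 320],
})

# One row per ID; each cell holds that group's values as a Python list.
per_group = df.groupby('ID').agg(Xs=('X', list), Ys=('Y', list)).reset_index()

# Attach the per-group lists to every original row.
# how='left' keeps the rows in the order of the original frame.
merged = df.merge(per_group, on='ID', how='left')

# Variant without merge (an assumption, not from the answer):
# look up each row's ID in a Series of lists indexed by ID.
mapped = df.copy()
mapped['Xs'] = mapped['ID'].map(df.groupby('ID')['X'].agg(list))
mapped['Ys'] = mapped['ID'].map(df.groupby('ID')['Y'].agg(list))

print(merged)
```

Both variants compute the lists once per group and only then spread them over the rows, which avoids assigning a 2-D array to a single column inside `apply`.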

Regarding "python - Pandas groupby with a custom function to return column values as arrays", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61299310/
