
python - Collapse some rows in columns in Pandas Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 23:54:19

I have a DataFrame like this in Pandas:

   ID  rating  G1  G2  G3  G4  G5  G6  G7
0   1     2.5  18   0   0   0   0   0   0
1   4     4.0  18   0   0   0   0   0   0
2   7     3.0  78   1   0   0   0   0   0
3   1     4.0  21   7   8  10  30  40  20
4  21     3.0  18   0   0   0   0   0   0
5   7     2.0  18  80  10  11   8   0   0
6  41     3.5  18   0   9  10   0   0   0

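I would like to group all elements by ID so that I get a kind of consecutive DataFrame in pandas whose entries are arrays built from the rows, like this: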

   ID                H1                        H2
0   1          [2.5,18]  [4.0,21,7,8,10,30,40,20]
1   4          [4.0,18]                       NaN
2   7          [3.0,78]       [2.0,18,80,10,11,8]
3  21          [3.0,18]                       NaN
4  41  [3.5,18,76,9,10]                       NaN

Do you know whether this is possible? Thanks.

Best answer

Use:

# reshape to long format with stack (one row per ID/column pair), then convert the Series to a one-column DataFrame
df = df.set_index('ID').stack().to_frame('s')
# flag the values equal to 0
mask = df['s'].eq(0)
# helper column: running count of zeros per ID, so each run of non-zero values shares one counter value
df['m'] = mask.groupby(level=0).cumsum()
# drop the zero rows
df = df[~mask].reset_index()
# helper column used to number the new columns: dense rank of the counter within each ID
df['g'] = df.groupby('ID')['m'].rank(method='dense').astype(int)
# build one list per (ID, g) group, reshape to wide format and add the H prefix
df = (df.groupby(['ID','g'])['s'].apply(list)
        .unstack()
        .add_prefix('H')
        .rename_axis(None, axis=1)
        .reset_index())
print(df)
   ID                H1                                             H2
0   1       [2.5, 18.0]  [4.0, 21.0, 7.0, 8.0, 10.0, 30.0, 40.0, 20.0]
1   4       [4.0, 18.0]                                            NaN
2   7  [3.0, 78.0, 1.0]             [2.0, 18.0, 80.0, 10.0, 11.0, 8.0]
3  21       [3.0, 18.0]                                            NaN
4  41       [3.5, 18.0]                                    [9.0, 10.0]
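
To try the answer end to end, here is a minimal self-contained sketch that rebuilds the sample DataFrame from the question and wraps the same steps in a function. The function name `collapse_nonzero_runs`, its parameters `id_col` and `prefix`, and the variable names are assumptions made for illustration; they do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "ID":     [1, 4, 7, 1, 21, 7, 41],
    "rating": [2.5, 4.0, 3.0, 4.0, 3.0, 2.0, 3.5],
    "G1":     [18, 18, 78, 21, 18, 18, 18],
    "G2":     [0, 0, 1, 7, 0, 80, 0],
    "G3":     [0, 0, 0, 8, 0, 10, 9],
    "G4":     [0, 0, 0, 10, 0, 11, 10],
    "G5":     [0, 0, 0, 30, 0, 8, 0],
    "G6":     [0, 0, 0, 40, 0, 0, 0],
    "G7":     [0, 0, 0, 20, 0, 0, 0],
})


def collapse_nonzero_runs(frame, id_col="ID", prefix="H"):
    """Hypothetical helper: one list per run of non-zero values, per ID."""
    # Long format: one row per (ID, original column) pair.
    long = frame.set_index(id_col).stack().to_frame("s")
    is_zero = long["s"].eq(0)
    # Running count of zeros within each ID; it stays constant inside a
    # run of non-zero values and grows whenever a zero is passed.
    long["m"] = is_zero.groupby(level=0).cumsum()
    long = long[~is_zero].reset_index()
    # Dense rank turns the counter values into 1, 2, ... per ID.
    long["g"] = long.groupby(id_col)["m"].rank(method="dense").astype(int)
    return (long.groupby([id_col, "g"])["s"].apply(list)
                .unstack()
                .add_prefix(prefix)
                .rename_axis(None, axis=1)
                .reset_index())


result = collapse_nonzero_runs(data)
print(result)
```

Because runs are separated only by zeros, two rows with the same ID that have no zero between them (the first ends in a non-zero value) would end up in the same list. In the sample data this does not occur, since every first row for a repeated ID ends with zeros.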

Regarding "python - Collapse some rows in columns in Pandas Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58389724/
