
python - How to merge rows and convert them into columns

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:56:46

I have a DataFrame like this:

ID  START  END  SEQ
1   11     12   1
1   14     15   3
1   13     14   2
2   10     14   1
3   11     15   1
3   16     17   2

I need to convert it into this DataFrame:

ID  START_1  END_1  SEQ_1  START_2  END_2  SEQ_2  START_3  END_3  SEQ_3
1   11       12     1      13       14     2      14       15     3
2   10       14     1      NA       NA     NA     NA       NA     NA
3   11       15     1      16       17     2      NA       NA     NA

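The problem is that the number of rows per ID is not known in advance, so the maximum number of START_X, END_X, SEQ_X columns should not be defined by hand. Given that the columns should be ordered by SEQ, is there an automated way to perform this conversion? Should I use group_by, or some other approach?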

Best answer

You can use groupby with unstack, then sort_index, and finally flatten the MultiIndex in the columns with a list comprehension:

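# Duplicate SEQ so it is kept as a value column after grouping on it
df['SEQ1'] = df.SEQ
# One row per ID, one block of columns per SEQ1 value
df = df.groupby(['ID','SEQ1']).mean().unstack()
# Order the columns by sequence number (second column level)
df = df.sort_index(axis=1, level=1)
# Flatten the MultiIndex columns into names such as START_1
df.columns = ['_'.join((col[0], str(col[1]))) for col in df.columns]
print(df)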
    START_1  END_1  SEQ_1  START_2  END_2  SEQ_2  START_3  END_3  SEQ_3
ID
1      11.0   12.0    1.0     13.0   14.0    2.0     14.0   15.0    3.0
2      10.0   14.0    1.0      NaN    NaN    NaN      NaN    NaN    NaN
3      11.0   15.0    1.0     16.0   17.0    2.0      NaN    NaN    NaN
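
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample data is copied from the question. The names `value_cols`, `seq_values` and `wide` are introduced here for illustration only, and the explicit column ordering is an assumption of this sketch (it replaces the `sort_index` call so that the START/END/SEQ order does not depend on how the remaining column level gets sorted).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":    [1, 1, 1, 2, 3, 3],
    "START": [11, 14, 13, 10, 11, 16],
    "END":   [12, 15, 14, 14, 15, 17],
    "SEQ":   [1, 3, 2, 1, 1, 2],
})

value_cols = ["START", "END", "SEQ"]  # illustrative name, not from the answer

# Copy SEQ so it survives as a value column after grouping on it,
# then spread the SEQ1 index level across the columns.
wide = (
    df.assign(SEQ1=df["SEQ"])
      .groupby(["ID", "SEQ1"])[value_cols]
      .mean()
      .unstack()
)

# Order the columns explicitly: by sequence number first, then START/END/SEQ.
seq_values = sorted(wide.columns.get_level_values(1).unique())
ordered = pd.MultiIndex.from_tuples(
    [(name, seq) for seq in seq_values for name in value_cols]
)
wide = wide.reindex(columns=ordered)

# Flatten the MultiIndex columns into names such as START_1.
wide.columns = [f"{name}_{seq}" for name, seq in wide.columns]
print(wide)
```

The number of `START_X`, `END_X`, `SEQ_X` blocks follows from the distinct SEQ values in the data, so nothing has to be defined by hand.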

Another solution uses pivot_table, where aggfunc='mean' is the default:

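# Duplicate SEQ so it is kept as a value column after pivoting on it
df['SEQ1'] = df.SEQ
# pivot_table aggregates with the mean by default; unstack moves SEQ1 into the columns
df = df.pivot_table(index= ['ID','SEQ1']).unstack()
# Order the columns by sequence number (second column level)
df = df.sort_index(axis=1, level=1)
# Flatten the MultiIndex columns into names such as END_1
df.columns = ['_'.join((col[0], str(col[1]))) for col in df.columns]
print(df)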
    END_1  SEQ_1  START_1  END_2  SEQ_2  START_2  END_3  SEQ_3  START_3
ID
1    12.0    1.0     11.0   14.0    2.0     13.0   15.0    3.0     14.0
2    14.0    1.0     10.0    NaN    NaN     NaN     NaN    NaN      NaN
3    15.0    1.0     11.0   17.0    2.0     16.0    NaN    NaN      NaN
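
Both solutions above aggregate with the mean, which turns the integers into floats and would average the rows if an (ID, SEQ) pair ever appeared twice. The following is a possible variation, not part of the original answer: it reshapes with `set_index` and `unstack` without aggregating, and casts to the nullable `Int64` dtype so the values stay integers. It assumes every (ID, SEQ) pair occurs only once. The names `value_cols`, `seq_values` and `wide` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":    [1, 1, 1, 2, 3, 3],
    "START": [11, 14, 13, 10, 11, 16],
    "END":   [12, 15, 14, 14, 15, 17],
    "SEQ":   [1, 3, 2, 1, 1, 2],
})

value_cols = ["START", "END", "SEQ"]  # illustrative name

# Copy SEQ into SEQ1, index by (ID, SEQ1) and spread SEQ1 across the columns.
# Assumption: (ID, SEQ) pairs are unique; unstack raises ValueError on duplicates.
wide = (
    df.assign(SEQ1=df["SEQ"])
      .set_index(["ID", "SEQ1"])[value_cols]
      .unstack()
      .astype("Int64")  # nullable integers: missing cells become <NA>
)

# Order the columns by sequence number first, then START/END/SEQ.
seq_values = sorted(wide.columns.get_level_values(1).unique())
ordered = pd.MultiIndex.from_tuples(
    [(name, seq) for seq in seq_values for name in value_cols]
)
wide = wide.reindex(columns=ordered)

# Flatten the MultiIndex columns into names such as START_1.
wide.columns = [f"{name}_{seq}" for name, seq in wide.columns]
print(wide)
```

Whether failing on duplicated pairs is preferable to averaging them depends on the data. If duplicates are expected, the aggregating versions above are the safer choice.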

Regarding "python - How to merge rows and convert them into columns", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/40710303/
