
python - Dynamically creating and renaming DataFrames

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:53:59

I want to run a list of DataFrames through the code below, which builds and renames df1 and df2. Can this be done with a def ... or some other approach?

import pandas as pd

df = pd.DataFrame({
    'A': ['d','d','d','d','d','d','g','g','g','g','g','g','k','k','k','k','k','k'],
    'B': [5,5,6,4,5,6,-6,7,7,6,-7,7,-8,7,-6,6,-7,50],
    'C': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2],
    'S': [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,2013,2012,2013,2014,2015,2016,2014]
})

df = (df.B + df.C).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)
df1 = df.groupby(level=0, axis=1).sum()
new_cols= list(zip(df1.columns.get_level_values(0),['total'] * len(df.columns)))
df1.columns = pd.MultiIndex.from_tuples(new_cols)
df2 = pd.concat([df1,df], axis=1).sort_index(axis=1).sort_index(axis=1, level=1)
df2.columns = ['_'.join((col[0], str(col[1]))) for col in df2.columns]
df2.columns = df2.columns.str.replace('sum_','')
df2.columns = df2.columns.str.replace('size_','T')

Best answer

I think you can use a custom function:

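def func(df):
    # Sum and count B + C for each (A, S) pair, then pivot S into the columns.
    df = (df.B + df.C).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)
    # Row totals per statistic ('sum' and 'size') across all years.
    # Note: groupby(..., axis=1) is deprecated in recent pandas releases.
    df1 = df.groupby(level=0, axis=1).sum()
    new_cols = list(zip(df1.columns.get_level_values(0), ['total'] * len(df1.columns)))
    df1.columns = pd.MultiIndex.from_tuples(new_cols)
    # Append the totals and order the columns by year, with 'total' last.
    df2 = pd.concat([df1,df], axis=1).sort_index(axis=1).sort_index(axis=1, level=1)
    # Flatten the two-level columns: 'sum_2012' -> '2012', 'size_2012' -> 'T2012'.
    df2.columns = ['_'.join((col[0], str(col[1]))) for col in df2.columns]
    df2.columns = df2.columns.str.replace('sum_','')
    df2.columns = df2.columns.str.replace('size_','T')
    return df2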

print(func(df))
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A
d      2    13      1     6      1     7      1     5      1     6       6
g      2   -11      1     8      1     8      1     8      1     7       6
k      1    -6      1     9      2    48      1     8      1    -5       6

   total
A
d     37
g     20
k     54
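The renaming at the end of the function is the least obvious part, so here is a minimal sketch of just that step. It assumes the columns arrive as (statistic, year) tuples like the ones the code above produces; the `cols` list is made up for illustration, and `regex=False` is my addition so the prefixes are treated as literal text.

```python
import pandas as pd

# Illustrative two-level column labels: (statistic, year or 'total').
cols = [("size", 2012), ("sum", 2012), ("size", "total"), ("sum", "total")]

# Join each pair into one string, e.g. ('size', 2012) -> 'size_2012'.
joined = ["_".join((c[0], str(c[1]))) for c in cols]

# Drop the 'sum_' prefix and shorten the 'size_' prefix to 'T'.
flat = (
    pd.Index(joined)
    .str.replace("sum_", "", regex=False)
    .str.replace("size_", "T", regex=False)
)

print(list(flat))
```

Each sum column ends up named by its year alone, and each count column gets a leading T.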

To process several DataFrames:

# df1, df2, df3 here are input DataFrames with the same columns as df
for df in [df1,df2,df3]:
    print(func(df))

If you need the results as a list of DataFrames:

dfs = [func(df) for df in [df1,df2,df3]]
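If the results also need names, one option is to keep them in a dict instead of a list, rather than creating variables dynamically. The sketch below is only an illustration: `summarize` is a hypothetical, simplified stand-in for the function above, and the keys `first` and `second` and the small input frames are invented.

```python
import pandas as pd

def summarize(df):
    # Hypothetical stand-in for the real function: sum of B + C per value of A.
    return (df["B"] + df["C"]).groupby(df["A"]).sum()

# Invented inputs, keyed by the name each result should carry.
frames = {
    "first": pd.DataFrame({"A": ["d", "d", "g"], "B": [1, 2, 3], "C": [1, 1, 1]}),
    "second": pd.DataFrame({"A": ["k", "k"], "B": [5, 6], "C": [2, 2]}),
}

# Apply the same function to every frame and keep the name as the key.
results = {name: summarize(frame) for name, frame in frames.items()}

print(results["first"])
```

Looking a result up by key, such as `results["second"]`, then takes the place of a separately named variable.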

For python - Dynamically creating and renaming DataFrames, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40973037/
