
python - How to get dummies and groupby

Reposted. Author: 行者123. Updated: 2023-11-28 22:30:01

I have the DataFrame below:

   Q  A
A a h
A b i
A c j
B d k
B a l
B b m
C c n

I want to get dummies and groupby:

    a   b   c   d    e   f   g
A h i j nan nan nan nan
B l nan nan nan k nan nan
C nan nan n nan nan nan nan

col=df.Q

I think I have to apply get_dummies and groupby, but I can't figure it out.

How can I get this result?

Best answer

It looks like you need reset_index and pivot:

df = df.reset_index().pivot(index='index', columns='Q', values='A')
print (df)
Q a b c d
index
A h i j None
B l m None k
C None None n None

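Then, if necessary, use reindex and replace (older pandas versions used reindex_axis, which has since been removed):

import numpy as np

cols = list('abcdefg')
print (df.reindex(columns=cols).replace({None:np.nan}))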
Q a b c d e f g
index
A h i j NaN NaN NaN NaN
B l m NaN k NaN NaN NaN
C NaN NaN n NaN NaN NaN NaN
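
For readers who want to run the steps above end to end, here is a minimal sketch. It assumes the row labels A/B/C are the DataFrame index, as in the question, and a reasonably recent pandas; the variable name `wide` is only illustrative.

```python
import pandas as pd

# Rebuild the question's frame: the row labels A/B/C are the index.
df = pd.DataFrame(
    {"Q": list("abcdabc"), "A": list("hijklmn")},
    index=list("AAABBBC"),
)

# Move the row labels into a regular column named "index", then pivot
# so that each distinct value of Q becomes its own column.
wide = df.reset_index().pivot(index="index", columns="Q", values="A")

# reindex(columns=...) adds the missing e, f, g columns, filled with NaN.
wide = wide.reindex(columns=list("abcdefg"))
print(wide)
```

In recent pandas versions the empty cells typically already come back as NaN, so the replace step may not be needed.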

Edit:

If the data contains duplicates, groupby with join is the better choice:

print (df)
Q A
A a h
A b i
A c j
B d k
B a l
B b m <-duplicates B b
B b t <-duplicates B b
C c n


df = df.reset_index().groupby(['index','Q'])['A'].apply(','.join).unstack()
print (df)
Q a b c d
index
A h i j None
B l m,t None k
C None None n None
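
The same idea as a self-contained sketch, assuming the sample data with the duplicated (B, b) pair shown above; the name `joined` is only illustrative.

```python
import pandas as pd

# Sample data with a duplicated (B, b) pair: values "m" and "t".
df = pd.DataFrame(
    {"Q": list("abcdabbc"), "A": list("hijklmtn")},
    index=list("AAABBBBC"),
)

# Group by the row label and Q, join duplicate values with a comma,
# then unstack Q into columns.
joined = (
    df.reset_index()
      .groupby(["index", "Q"])["A"]
      .apply(",".join)
      .unstack()
)
print(joined)
```

A plain pivot would raise an error on this data because the (B, b) pair is not unique, which is why the values are aggregated first.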

Another possible solution is pivot_table:

#aggfunc='first' - keeps only the first value, other values are lost
df1 = df.reset_index().pivot_table(index='index', columns='Q', values='A', aggfunc='first')
print (df1)
Q a b c d
index
A h i j None
B l m None k
C None None n None

#aggfunc='sum' - summed data, no separator
df2 = df.reset_index().pivot_table(index='index', columns='Q', values='A', aggfunc='sum')
print (df2)
Q a b c d
index
A h i j None
B l mt None k
C None None n None

#aggfunc=','.join - summed data with separator
df3 = df.reset_index().pivot_table(index='index', columns='Q', values='A', aggfunc=','.join)
print (df3)
Q a b c d
index
A h i j None
B l m,t None k
C None None n None
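
To compare the aggregation choices side by side, here is a small sketch on the same duplicated data. The helper `pivot_with` is hypothetical and exists only to avoid repeating the call; it is not part of pandas.

```python
import pandas as pd

# Same sample data as above, with the duplicated (B, b) pair.
df = pd.DataFrame(
    {"Q": list("abcdabbc"), "A": list("hijklmtn")},
    index=list("AAABBBBC"),
)
long_df = df.reset_index()

def pivot_with(aggfunc):
    """Hypothetical helper: pivot long_df with the given aggregation."""
    return long_df.pivot_table(
        index="index", columns="Q", values="A", aggfunc=aggfunc
    )

first = pivot_with("first")      # keeps "m", drops "t"
joined = pivot_with(",".join)    # keeps both as "m,t"
print(first)
print(joined)
```

Which aggregation fits depends on whether losing the extra values is acceptable; 'first' silently discards them, while joining keeps everything in one string.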

Regarding python - How to get dummies and groupby, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42697390/
