python - Adding dummy columns to the original dataframe

Reposted. Author: IT老高. Updated: 2023-10-28 22:11:35

I have a dataframe that looks like this:

            JOINED_CO  GENDER    EXEC_FULLNAME  GVKEY  YEAR    CONAME  BECAMECEO  REJOIN   LEFTOFC    LEFTCO  RELEFT    REASON  PAGE
CO_PER_ROL
5622              NaN    MALE   Ira A. Eichner   1004  1992  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5622              NaN    MALE   Ira A. Eichner   1004  1993  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5622              NaN    MALE   Ira A. Eichner   1004  1994  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5622              NaN    MALE   Ira A. Eichner   1004  1995  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5622              NaN    MALE   Ira A. Eichner   1004  1996  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5622              NaN    MALE   Ira A. Eichner   1004  1997  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5622              NaN    MALE   Ira A. Eichner   1004  1998  AAR CORP   19550101     NaN  19961001  19990531     NaN  RESIGNED    79
5623              NaN    MALE  David P. Storch   1004  1992  AAR CORP   19961009     NaN       NaN       NaN     NaN       NaN    57
5623              NaN    MALE  David P. Storch   1004  1993  AAR CORP   19961009     NaN       NaN       NaN     NaN       NaN    57
5623              NaN    MALE  David P. Storch   1004  1994  AAR CORP   19961009     NaN       NaN       NaN     NaN       NaN    57
5623              NaN    MALE  David P. Storch   1004  1995  AAR CORP   19961009     NaN       NaN       NaN     NaN       NaN    57
5623              NaN    MALE  David P. Storch   1004  1996  AAR CORP   19961009     NaN       NaN       NaN     NaN       NaN    57

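For the YEAR values, I would like to add year columns (1993, 1994, ..., 2009) to the original dataframe. If the value in YEAR is 1992, then the value in the 1992 column should be 1, otherwise 0.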

I used a very naive for loop, but it seems to run forever because my dataset is large. Can anyone help? Thanks a lot!

Best answer

In [77]: df = pd.concat([df, pd.get_dummies(df['YEAR'])], axis=1); df
Out[77]:
JOINED_CO GENDER EXEC_FULLNAME GVKEY YEAR CONAME BECAMECEO \
5622 NaN MALE Ira A. Eichner 1004 1992 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1993 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1994 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1995 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1996 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1997 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1998 AAR CORP 19550101
5623 NaN MALE David P. Storch 1004 1992 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1993 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1994 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1995 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1996 AAR CORP 19961009

REJOIN LEFTOFC LEFTCO RELEFT REASON PAGE 1992 1993 1994 \
5622 NaN 19961001 19990531 NaN RESIGNED 79 1 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 1 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 1
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5623 NaN NaN NaN NaN NaN 57 1 0 0
5623 NaN NaN NaN NaN NaN 57 0 1 0
5623 NaN NaN NaN NaN NaN 57 0 0 1
5623 NaN NaN NaN NaN NaN 57 0 0 0
5623 NaN NaN NaN NaN NaN 57 0 0 0

1995 1996 1997 1998
5622 0 0 0 0
5622 0 0 0 0
5622 0 0 0 0
5622 1 0 0 0
5622 0 1 0 0
5622 0 0 1 0
5622 0 0 0 1
5623 0 0 0 0
5623 0 0 0 0
5623 0 0 0 0
5623 1 0 0 0
5623 0 1 0 0
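
To try this without the full dataset, here is a minimal sketch on a small made-up frame (the two-column sample data below is an assumption for illustration, not the asker's data). Note that in pandas 2.0 and later, get_dummies returns boolean columns by default, so dtype=int is passed here to get the 1/0 values shown in the output above.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question; the index
# repeats per executive, as in the original data.
df = pd.DataFrame(
    {
        "EXEC_FULLNAME": ["Ira A. Eichner", "Ira A. Eichner", "David P. Storch"],
        "YEAR": [1992, 1993, 1992],
    },
    index=pd.Index([5622, 5622, 5623], name="CO_PER_ROL"),
)

# One indicator column per distinct YEAR value, as integers (1/0).
dummies = pd.get_dummies(df["YEAR"], dtype=int)

# Attach the indicator columns to the original frame, side by side.
result = pd.concat([df, dummies], axis=1)
print(result)
```

The new column labels are the year values themselves (integers), so a single year is selected with result[1992], not result["1992"].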

If you want to drop the YEAR column, you can follow up with del df['YEAR']. Alternatively, drop the YEAR column from df before calling concat:

df = pd.concat([df.drop('YEAR', axis=1), pd.get_dummies(df['YEAR'])], axis=1)
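
The question asks for a column for every year through 2009, but get_dummies only creates columns for values that actually occur in YEAR. One possible sketch, assuming the desired range is 1992 to 2009 and using a made-up two-row frame, is to reindex the dummy columns against the full range and fill the missing years with 0:

```python
import pandas as pd

# Hypothetical sample: only 1992 and 1994 occur in YEAR.
df = pd.DataFrame({"GVKEY": [1004, 1004], "YEAR": [1992, 1994]})

# Assumed target range of year columns (1992 through 2009 inclusive).
all_years = list(range(1992, 2010))

# Build the indicators, then add all-zero columns for years that never occur.
dummies = pd.get_dummies(df["YEAR"], dtype=int).reindex(
    columns=all_years, fill_value=0
)

# Drop YEAR and attach the full set of year columns.
result = pd.concat([df.drop("YEAR", axis=1), dummies], axis=1)
print(result.shape)
```

Reindexing also fixes the column order, which can help when several files are processed separately and need identical layouts.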

Regarding "python - Adding dummy columns to the original dataframe", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23208745/
