
Python data wrangling: Loop through values in DataFrame and check if current iterator matches the former

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:34:30

I'm stuck on a data wrangling problem. Here is my data:

import pandas as pd

Year = ['2010','2011','2012','2013','2014','2015','2010','2011','2014','2015','2016','2010','2011','2012','2015']
Type = ['WAS','WAS','BOS','BOS','WAS','BOS','BOS','BOS','WAS','WAS','BOS','BOS','BOS','BOS','BOS']
ID = ['a','a','a','a','a','a','b','b','b','b','b','c','c','c','c']
df = pd.DataFrame({'ID': ID,'Type': Type,'Year': Year})

df
a WAS 2010
a WAS 2011
a BOS 2012
a BOS 2013
a WAS 2014
and so on...............

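I'm trying to accomplish two things. First, I want to loop through the DataFrame and, for each row, check whether the ID is the same as in the previous row and whether the previous Type matches the current Type. Then I want to create two new binary variables, 'WAStoBOS' and 'BOStoWAS'. Each is 0 if there was no change at all or the change was not in the direction the variable name describes, and 1 if the change was in the direction of the variable name.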

For example, the output would be:

df
ID Type Year WAStoBOS BOStoWAS
a WAS 2010 0 0
a WAS 2011 0 0
a BOS 2012 1 0
a BOS 2013 0 0
a WAS 2014 0 1
a BOS 2015 1 0

Second: within the same structure, by ID, find the difference between the current row's Year and the previous row's Year.

The final DataFrame would be:

    df
ID Type Year WAStoBOS BOStoWAS YearDiff
a WAS 2010 0 0 0
a WAS 2011 0 0 1
a BOS 2012 1 0 1
a BOS 2013 0 0 1
a WAS 2014 0 1 1
a BOS 2015 1 0 1
b BOS 2010 0 0 0
b BOS 2011 0 0 1
b WAS 2014 0 1 3
b WAS 2015 0 0 1
b BOS 2016 1 0 1
c BOS 2010 0 0 0
c BOS 2011 0 0 1
c BOS 2012 0 0 1
c BOS 2015 0 0 3

Any help would be appreciated.


This edit was made following Scott's suggestion.

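For example, your code incorrectly assigns 1 to instances where both the ID and the Type change. If the ID changes, we don't care what the previous Type was. I'll alter the DataFrame below slightly to illustrate changes in both ID and Type, and also to show what the desired output should be: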

        df
ID Type Year WAStoBOS BOStoWAS YearDiff
a WAS 2010 0 0 0
a WAS 2011 0 0 1
a BOS 2012 1 0 1
a BOS 2013 0 0 1
a WAS 2014 0 1 1
**a BOS 2015** 1 0 1
**b WAS 2010** 0 0 0
b BOS 2011 1 0 1
b WAS 2014 0 1 3
b WAS 2015 0 0 1
**b WAS 2016** 0 0 1
**c BOS 2010** 0 0 0
c BOS 2011 0 0 1
c BOS 2012 0 0 1
c BOS 2015 0 0 3

I've put asterisks next to the instances where both ID and Type change, for your reference. Thanks for your help; I had never thought of using assign.

Best answer

Edit, to take 'ID' into account when assigning the binary flags:

df.assign(WAStoBOS=df.groupby('ID')['Type'].transform(lambda x: ((x == 'BOS') & (x.shift(1) == 'WAS')).astype(int)),
BOStoWAS=df.groupby('ID')['Type'].transform(lambda x: ((x == 'WAS') & (x.shift(1) == 'BOS')).astype(int)),
YearDiff=df.groupby('ID')['Year'].transform(lambda x: x.astype(int).diff().fillna(0)))
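
The same per-ID logic can also be written without lambdas by shifting Type within each ID once and reusing the result. The following is a minimal sketch, not part of the original answer: it rebuilds the modified data from the question's edit, and the names `prev_type` and `out` are illustrative. It also casts YearDiff back to integers, which is an assumption about the desired dtype based on the question's expected table.

```python
import pandas as pd

# Modified data from the question's edit: ID and Type both change at some ID boundaries.
df = pd.DataFrame({
    'ID':   list('aaaaaabbbbbcccc'),
    'Type': ['WAS', 'WAS', 'BOS', 'BOS', 'WAS', 'BOS',
             'WAS', 'BOS', 'WAS', 'WAS', 'WAS',
             'BOS', 'BOS', 'BOS', 'BOS'],
    'Year': ['2010', '2011', '2012', '2013', '2014', '2015',
             '2010', '2011', '2014', '2015', '2016',
             '2010', '2011', '2012', '2015'],
})

# Previous Type within the same ID; missing on the first row of each ID,
# so the comparisons below are False there.
prev_type = df.groupby('ID')['Type'].shift(1)

out = df.assign(
    WAStoBOS=((df['Type'] == 'BOS') & (prev_type == 'WAS')).astype(int),
    BOStoWAS=((df['Type'] == 'WAS') & (prev_type == 'BOS')).astype(int),
    # Year difference within each ID; the first row of each ID has no
    # predecessor, so its missing value is filled with 0.
    YearDiff=df['Year'].astype(int).groupby(df['ID']).diff().fillna(0).astype(int),
)
print(out)
```

Because the shift happens inside each group, the first row of every ID can never be flagged as a transition, which is the behaviour the question's edit asks for.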

Let's do this in one statement:

df.assign(WAStoBOS=((df.Type == 'BOS') & (df.shift(1).Type == 'WAS')).astype(int),
BOStoWAS=((df.Type == 'WAS') & (df.shift(1).Type == 'BOS')).astype(int),
YearDiff=df.groupby('ID')['Year'].transform(lambda x: x.astype(int).diff().fillna(0)))

Output:

   ID Type  Year  BOStoWAS  WAStoBOS  YearDiff
0 a WAS 2010 0 0 0.0
1 a WAS 2011 0 0 1.0
2 a BOS 2012 0 1 1.0
3 a BOS 2013 0 0 1.0
4 a WAS 2014 1 0 1.0
5 a BOS 2015 0 1 1.0
6 b BOS 2010 0 0 0.0
7 b BOS 2011 0 0 1.0
8 b WAS 2014 1 0 3.0
9 b WAS 2015 0 0 1.0
10 b BOS 2016 0 1 1.0
11 c BOS 2010 0 0 0.0
12 c BOS 2011 0 0 1.0
13 c BOS 2012 0 0 1.0
14 c BOS 2015 0 0 3.0
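
On the original data, a plain shift and a per-ID shift give the same flags, because no Type change happens to coincide with an ID boundary. The sketch below is an illustration added here, not part of the original answer. It runs both variants on the modified data from the question's edit and prints the rows where they disagree; the helper `flags` and the names `plain` and `grouped` are hypothetical.

```python
import pandas as pd

# Modified data from the question's edit.
df = pd.DataFrame({
    'ID':   list('aaaaaabbbbbcccc'),
    'Type': ['WAS', 'WAS', 'BOS', 'BOS', 'WAS', 'BOS',
             'WAS', 'BOS', 'WAS', 'WAS', 'WAS',
             'BOS', 'BOS', 'BOS', 'BOS'],
    'Year': ['2010', '2011', '2012', '2013', '2014', '2015',
             '2010', '2011', '2014', '2015', '2016',
             '2010', '2011', '2012', '2015'],
})

def flags(prev):
    """Build both transition flags from a Series holding the previous Type."""
    return pd.DataFrame({
        'WAStoBOS': (df['Type'].eq('BOS') & prev.eq('WAS')).astype(int),
        'BOStoWAS': (df['Type'].eq('WAS') & prev.eq('BOS')).astype(int),
    })

# Plain shift: the first row of an ID is compared with the last row of the previous ID.
plain = flags(df['Type'].shift(1))

# Grouped shift: the first row of each ID has no previous Type at all.
grouped = flags(df.groupby('ID')['Type'].shift(1))

# Rows where the two variants disagree on at least one flag.
differs = (plain != grouped).any(axis=1)
print(df[differs].join(plain[differs].add_suffix('_plain')))
```

The disagreeing rows are the first rows of IDs 'b' and 'c', where the plain shift reports a transition that actually spans two different IDs.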

Regarding "Python data wrangling: Loop through values in DataFrame and check if current iterator matches the former", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44660145/
