
python - Pandas: divide two rows by each other

Reposted. Author: 行者123. Updated: 2023-11-28 21:51:48

Here are two rows from my DataFrame:

>>> test.loc[test.index.year == 2009]
0 1 2 3 4 \
date
2009-01-01 252.855283 353.6261 556.295659 439.558188 432.936844

5 6 employment
date
2009-01-01 439.437132 433.269903 64.116667

>>> test.loc[test.index.year == 2007]
0 1 2 3 4 \
date
2007-01-01 269.277757 380.608002 401.765546 491.893821 433.864499

5 6 employment
date
2007-01-01 492.396073 489.260588 69.1375

When I try divide, I get:

>>> test.loc[test.index.year == 2009].divide(test.loc[test.index.year == 2007])
0 1 2 3 4 5 6 employment
date
2007-01-01 NaN NaN NaN NaN NaN NaN NaN NaN
2009-01-01 NaN NaN NaN NaN NaN NaN NaN NaN

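This happens because pandas matches the rows by their index labels before dividing. However, none of the axis= options helps. I can get the correct result with: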

test.loc[test.index.year == 2009].values / test.loc[test.index.year == 2007].values
array([[ 0.93901288, 0.92910842, 1.38462759, 0.8936038 , 0.99786188,
0.89244646, 0.88556061, 0.92737902]])

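Is there no better way? I would like to keep the index that corresponds to the 2007-01-01 record. Of course I could re-attach it to the values, but usually when I try things like this there is my way, and then there is the right way. So: what else can I do?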

Accepted answer

If you want to keep the 2007 index, I think you can simply do:

df.loc[df.index.year == 2007]/df.loc[df.index.year == 2009].values

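The reason that df.loc[df.index.year == 2007]/df.loc[df.index.year == 2009] or df.loc[df.index.year == 2007].divide(df.loc[df.index.year == 2009]) does not work is that pandas tries to align the data by index. Here the 2007 row is matched against a row labelled 2007 in the other operand, and likewise for 2009. Since neither label exists on both sides, every value is NaN. That is also why you get two rows of NaN rather than just one.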
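The alignment described above can be reproduced on a small frame. This is only a sketch: the two-column frame and its rounded values are stand-ins for the question's data, not the original DataFrame.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the question's data (values rounded).
df = pd.DataFrame(
    {"0": [269.28, 252.86], "employment": [69.14, 64.12]},
    index=pd.to_datetime(["2007-01-01", "2009-01-01"]),
)
df.index.name = "date"

row_2007 = df.loc[df.index.year == 2007]
row_2009 = df.loc[df.index.year == 2009]

# Division aligns on the row labels first. The result index is the union of
# both indexes, and no label is present on both sides, so every cell is NaN.
aligned = row_2009 / row_2007
print(aligned)
```

The result has one row per label found in either operand, which is why two NaN rows come back from two one-row inputs.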

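Therefore, we need to convert one of the two operands to its underlying np.array for the division to work: (df.loc[df.index.year == 2007]/df.loc[df.index.year == 2009].values). The numerator is left untouched, so its index is preserved.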
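Note that the expression above computes 2007 divided by 2009, while the question asked for 2009 divided by 2007 labelled with the 2007 date. A minimal sketch of both follows, assuming a two-column subset of the question's values. The variable names are illustrative only.

```python
import pandas as pd

# Two of the question's columns, with the values printed in the question.
df = pd.DataFrame(
    {"0": [269.277757, 252.855283], "employment": [69.1375, 64.116667]},
    index=pd.to_datetime(["2007-01-01", "2009-01-01"]),
)

row_2007 = df.loc[df.index.year == 2007]
row_2009 = df.loc[df.index.year == 2009]

# The answer's form: .values strips the labels from the denominator, so no
# alignment happens and the numerator's 2007 index is kept.
ratio_07_over_09 = row_2007 / row_2009.values

# The question's ratio (2009 / 2007), explicitly relabelled with the 2007 index.
ratio_09_over_07 = pd.DataFrame(
    row_2009.values / row_2007.values,
    index=row_2007.index,
    columns=df.columns,
)
print(ratio_09_over_07)
```

Building the frame explicitly makes the choice of index visible at the call site, at the cost of bypassing the label checks pandas would otherwise perform.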

@EdChum, I don't think this is a bug. I think it is the expected behavior with boolean indexing; consider this:

df.iloc[df.index.year>=2007]/df.loc[df.index.year == 2007]
0 1 2 3 4 5 6 employment
date
2007-01-01 1 1 1 1 1 1 1 1
2009-01-01 NaN NaN NaN NaN NaN NaN NaN NaN

But you should be careful with this approach, because boolean indexing may return more than one row. Look at these two examples:

In [128]:

print(df)
0 1 2 3 4 \
2007-12-31 252.855283 353.626100 556.295659 439.558188 432.936844
2008-12-31 269.277757 380.608002 401.765546 491.893821 433.864499
2009-12-31 269.277757 380.608002 401.765546 491.893821 433.864499

5 6 7
2007-12-31 439.437132 433.269903 64.116667
2008-12-31 492.396073 489.260588 69.137500
2009-12-31 492.396073 489.260588 69.137500
In [130]:

print(df.iloc[df.index.year==2007]/df.loc[df.index.year >= 2007])
#divide one row by 3 rows? Dimension mismatch? No, it will work just fine.
0 1 2 3 4 5 6 7
2007-12-31 1 1 1 1 1 1 1 1
2008-12-31 NaN NaN NaN NaN NaN NaN NaN NaN
2009-12-31 NaN NaN NaN NaN NaN NaN NaN NaN
In [131]:

df.iloc[df.index.year==2007]/df.loc[df.index.year >= 2007].values
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
**************
ValueError: Shape of passed values is (8, 3), indices imply (8, 1)
#basically won't work due to dimension mismatch
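One way to guard against the multi-row case shown above is to check the mask result before dividing. The sketch below is an assumption-laden illustration: single_row is a hypothetical helper, not a pandas function, and the frame is a rounded two-column subset of the example data.

```python
import pandas as pd

# Rounded two-column subset of the three-row example frame (illustrative values).
df = pd.DataFrame(
    {"0": [252.86, 269.28, 269.28], "7": [64.12, 69.14, 69.14]},
    index=pd.to_datetime(["2007-12-31", "2008-12-31", "2009-12-31"]),
)

def single_row(frame, year):
    """Hypothetical helper: return the only row for `year` as a Series.

    Raises ValueError if the boolean mask matches zero rows or several rows.
    """
    rows = frame.loc[frame.index.year == year]
    if len(rows) != 1:
        raise ValueError(f"expected exactly 1 row for {year}, got {len(rows)}")
    return rows.iloc[0]

# Dividing a DataFrame by a Series aligns the Series index with the columns
# and broadcasts down the rows, so every row is divided by the 2007 row.
relative = df / single_row(df, 2007)
print(relative)
```

Because the denominator is a Series indexed by column name, the row labels of df are kept and no .values conversion is needed.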

Regarding python - Pandas: divide two rows by each other, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/29399464/
