
indexing - How to apply a function to a date-indexed DataFrame

Reposted. Author: 行者123. Updated: 2023-12-01 09:02:42

I am having a lot of trouble working with a DataFrame that has dates as its index.

from pandas import DataFrame, date_range
# Create a dataframe with dates as your index
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
idx = date_range('1/1/2012', periods=10, freq='MS')
df = DataFrame(data, index=idx, columns=['Revenue'])
df['State'] = ['NY', 'NY', 'NY', 'NY', 'FL', 'FL', 'GA', 'GA', 'FL', 'FL']

In [6]: df
Out[6]:
            Revenue State
2012-01-01        1    NY
2012-02-01        2    NY
2012-03-01        3    NY
2012-04-01        4    NY
2012-05-01        5    FL
2012-06-01        6    FL
2012-07-01        7    GA
2012-08-01        8    GA
2012-09-01        9    FL
2012-10-01       10    FL

I am trying to add an additional column named 'Mean' that holds the mean of each row's group. I tried this, but it does not work:

df2 = df
df2['Mean'] = df.groupby(['State'])['Revenue'].apply(lambda x: mean(x))

In [9]: df2.head(10)
Out[9]:
            Revenue State  Mean
2012-01-01        1    NY   NaN
2012-02-01        2    NY   NaN
2012-03-01        3    NY   NaN
2012-04-01        4    NY   NaN
2012-05-01        5    FL   NaN
2012-06-01        6    FL   NaN
2012-07-01        7    GA   NaN
2012-08-01        8    GA   NaN
2012-09-01        9    FL   NaN
2012-10-01       10    FL   NaN

But I would like to get:

            Revenue State  Mean
2012-01-01        1    NY   2.5
2012-02-01        2    NY   2.5
2012-03-01        3    NY   2.5
2012-04-01        4    NY   2.5
2012-05-01        5    FL   7.5
2012-06-01        6    FL   7.5
2012-07-01        7    GA   7.5
2012-08-01        8    GA   7.5
2012-09-01        9    FL   7.5
2012-10-01       10    FL   7.5

How can I get this DataFrame?

Best answer

You almost have it! First compute the mean of each group:

means = df.groupby('State').mean()

In [5]: means
Out[5]:
       Revenue
State
FL         7.5
GA         7.5
NY         2.5
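
Note that means is indexed by State rather than by date. This is most likely why the direct assignment in the question produced NaN: when a Series is assigned to a column, pandas aligns it on index labels, and none of the date labels match 'FL', 'GA' or 'NY'. A minimal sketch of that effect, using a shortened four-row frame (group_means is an illustrative name, not from the question):

```python
from pandas import DataFrame, date_range

# Shortened version of the question's frame (assumed data, four rows only)
idx = date_range('1/1/2012', periods=4, freq='MS')
df = DataFrame({'Revenue': [1, 2, 5, 6],
                'State': ['NY', 'NY', 'FL', 'FL']}, index=idx)

# The group means are labelled by state, not by date
group_means = df.groupby('State')['Revenue'].mean()
print(group_means.index.tolist())

# Column assignment aligns on index labels; no date equals 'FL' or 'NY',
# so every row of the new column is left as NaN
df['Mean'] = group_means
print(df['Mean'].isna().all())
```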

Then look up that mean for the State of each row in the DataFrame:

# Look up each row's state in means by label
df['mean'] = df['State'].apply(lambda x: means.loc[x, 'Revenue'])

In [7]: df
Out[7]:
            Revenue State  mean
2012-01-01        1    NY   2.5
2012-02-01        2    NY   2.5
2012-03-01        3    NY   2.5
2012-04-01        4    NY   2.5
2012-05-01        5    FL   7.5
2012-06-01        6    FL   7.5
2012-07-01        7    GA   7.5
2012-08-01        8    GA   7.5
2012-09-01        9    FL   7.5
2012-10-01       10    FL   7.5
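
As an alternative that is not part of the original answer, the same column can be built in one step with groupby(...).transform, which returns a result aligned to the original date index. The sketch below rebuilds the question's frame so that it runs on its own; the column names 'Mean' and 'MeanViaMap' are chosen for illustration.

```python
from pandas import DataFrame, date_range

# Rebuild the DataFrame from the question
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
idx = date_range('1/1/2012', periods=10, freq='MS')
df = DataFrame(data, index=idx, columns=['Revenue'])
df['State'] = ['NY', 'NY', 'NY', 'NY', 'FL', 'FL', 'GA', 'GA', 'FL', 'FL']

# transform broadcasts each group's mean back onto that group's rows,
# so the result keeps df's date index and assigns without NaN
df['Mean'] = df.groupby('State')['Revenue'].transform('mean')

# Equivalent lookup: map each row's state through the per-state means
state_means = df.groupby('State')['Revenue'].mean()
df['MeanViaMap'] = df['State'].map(state_means)

print(df)
```

Both columns should hold 2.5 for the NY rows and 7.5 for the FL and GA rows. transform is usually the shorter choice when the goal is one value per original row.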

Regarding indexing - How to apply a function to a date-indexed DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13958129/
