
Python: Generating fill-down variables in a Pandas DataFrame

Reposted. Author: 行者123. Updated: 2023-11-28 21:50:53

I have the following DataFrame df:

                S   
2011-01-26 1
2011-01-27 0
2011-01-28 0
2011-01-29 0
2011-01-30 0
2011-01-31 0
2011-02-01 0
2011-02-02 0
2011-02-03 0
2011-02-04 0
2011-02-05 0
2011-02-06 0
2011-02-07 0
2011-02-08 0
2011-02-09 0

I am trying to generate the following DataFrame from df:

                S  S1 S2 S3   
2011-01-26 1 0 0 0
2011-01-27 0 1 0 0
2011-01-28 0 1 0 0
2011-01-29 0 0 1 0
2011-01-30 0 0 1 0
2011-01-31 0 0 1 0
2011-02-01 0 0 1 0
2011-02-02 0 0 0 1
2011-02-03 0 0 0 1
2011-02-04 0 0 0 1
2011-02-05 0 0 0 1
2011-02-06 0 0 0 1
2011-02-07 0 0 0 1
2011-02-08 0 0 0 1
2011-02-09 0 0 0 1

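As you can see, the number of 1s doubles from each column to the next as you move down (1, 2, 4, 8 rows). Is there a function in Pandas, similar to fillna, where I can specify that the fill should extend down x rows?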

Update: in fact, my task is more complex.

If this is my df:

                S   
2011-01-26 1
2011-01-27 0
2011-01-28 0
2011-01-29 0
2011-01-30 0
2011-01-31 0
2011-02-01 0
2011-02-02 0
2011-02-03 0
2011-02-04 0
2011-02-05 0
2011-02-06 0
2011-02-07 0
2011-02-08 0
2011-02-09 0
... (all zeros)
S
2011-04-26 1
2011-04-27 0
2011-04-28 0
2011-04-29 0
2011-04-30 0
2011-04-31 0
2011-05-01 0
2011-05-02 0
2011-05-03 0
2011-05-04 0
2011-05-05 0
2011-05-06 0
2011-05-07 0
2011-05-08 0
2011-05-09 0

I need this:

                S  S1 S2 S3   
2011-01-26 1 0 0 0
2011-01-27 0 1 0 0
2011-01-28 0 1 0 0
2011-01-29 0 0 1 0
2011-01-30 0 0 1 0
2011-01-31 0 0 1 0
2011-02-01 0 0 1 0
2011-02-02 0 0 0 1
2011-02-03 0 0 0 1
2011-02-04 0 0 0 1
2011-02-05 0 0 0 1
2011-02-06 0 0 0 1
2011-02-07 0 0 0 1
2011-02-08 0 0 0 1
2011-02-09 0 0 0 1
... (all zeros everywhere)
S S1 S2 S3
2011-04-26 1 0 0 0
2011-04-27 0 1 0 0
2011-04-28 0 1 0 0
2011-04-29 0 0 1 0
2011-04-30 0 0 1 0
2011-04-31 0 0 1 0
2011-05-01 0 0 1 0
2011-05-02 0 0 0 1
2011-05-03 0 0 0 1
2011-05-04 0 0 0 1
2011-05-05 0 0 0 1
2011-05-06 0 0 0 1
2011-05-07 0 0 0 1
2011-05-08 0 0 0 1
2011-05-09 0 0 0 1

Best answer

As far as I know, there is no built-in function that does this. But we can get something similar with the following trick.

import pandas as pd
import numpy as np

# your data
# ========================================
df = pd.DataFrame(0, index=pd.date_range('2015-01-01', periods=100, freq='D'), columns=['col'])
df.iloc[[0, 71], 0] = 1

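# each 1 starts a new block, so the running sum labels the blocks 1, 2, ...
# group_keys=False keeps the plain date index in the apply() result on pandas >= 2.0
grouped = df.groupby(df.col.cumsum(), group_keys=False)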

grouped.get_group(1)

Out[275]:
col
2015-01-01 1
2015-01-02 0
2015-01-03 0
2015-01-04 0
2015-01-05 0
2015-01-06 0
2015-01-07 0
2015-01-08 0
... ...
2015-03-05 0
2015-03-06 0
2015-03-07 0
2015-03-08 0
2015-03-09 0
2015-03-10 0
2015-03-11 0
2015-03-12 0

[71 rows x 1 columns]

grouped.get_group(2)

Out[276]:
col
2015-03-13 1
2015-03-14 0
2015-03-15 0
2015-03-16 0
2015-03-17 0
2015-03-18 0
2015-03-19 0
2015-03-20 0
... ...
2015-04-03 0
2015-04-04 0
2015-04-05 0
2015-04-06 0
2015-04-07 0
2015-04-08 0
2015-04-09 0
2015-04-10 0

[29 rows x 1 columns]
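The grouping key works because every 1 in col marks the start of a new block, so the running sum stays constant until the next 1 appears. A minimal sketch of that idea on a hypothetical six-row Series (the name `s` and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data: a 1 marks the start of each block.
s = pd.Series([1, 0, 0, 1, 0, 0])

# The running sum goes up by one at every 1, so it labels the blocks 1, 2, ...
labels = s.cumsum()
print(labels.tolist())  # [1, 1, 1, 2, 2, 2]

# Grouping by those labels yields one group per block.
sizes = s.groupby(labels).size()
print(sizes.tolist())   # [3, 3]
```

Note that any rows before the first 1 would get the label 0 and form a group of their own.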

# processing
# ==================================

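def func(group):
    # positions 0, 1, 3, 7, ... (2**k - 1) are where a new run of 1s starts
    starts = 2 ** np.arange(int(np.log2(len(group))) + 1) - 1
    temp = np.zeros(len(group), dtype=int)
    temp[starts] = 1
    # the running sum numbers the runs 1, 2, 3, ...; get_dummies makes one 0/1 column per run
    new_col = pd.Series(temp.cumsum(), index=group.index)
    return pd.get_dummies(new_col, dtype=int)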


grouped.apply(func)

Out[281]:
1 2 3 4 5 6 7
2015-01-01 1 0 0 0 0 0 0
2015-01-02 0 1 0 0 0 0 0
2015-01-03 0 1 0 0 0 0 0
2015-01-04 0 0 1 0 0 0 0
2015-01-05 0 0 1 0 0 0 0
2015-01-06 0 0 1 0 0 0 0
2015-01-07 0 0 1 0 0 0 0
2015-01-08 0 0 0 1 0 0 0
... .. .. .. .. .. .. ..
2015-04-03 0 0 0 0 1 NaN NaN
2015-04-04 0 0 0 0 1 NaN NaN
2015-04-05 0 0 0 0 1 NaN NaN
2015-04-06 0 0 0 0 1 NaN NaN
2015-04-07 0 0 0 0 1 NaN NaN
2015-04-08 0 0 0 0 1 NaN NaN
2015-04-09 0 0 0 0 1 NaN NaN
2015-04-10 0 0 0 0 1 NaN NaN
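If only the four columns S, S1, S2, S3 from the question are wanted, the same result can probably be reached without apply. The following is an alternative sketch, not the method used in the answer above: it takes each row's position inside its block and uses floor(log2(position + 1)) as the column number. It assumes the data begins with a 1, and the 15-row frame below is hypothetical data shaped like the question's.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the question: 15 daily rows, a single 1 at the top.
df = pd.DataFrame({"S": 0}, index=pd.date_range("2011-01-26", periods=15, freq="D"))
df.iloc[0, 0] = 1

# 0-based position of each row inside its block (a block starts at every 1).
pos = df.groupby(df["S"].cumsum()).cumcount()

# Positions 0 | 1-2 | 3-6 | 7-14 map to column numbers 0 | 1 | 2 | 3.
block = np.floor(np.log2(pos + 1)).astype(int)

# One 0/1 column per number; reindex keeps exactly columns 0..3,
# dropping higher ones and adding all-zero columns for numbers that never occur.
dummies = pd.get_dummies(block, dtype=int).reindex(columns=range(4), fill_value=0)
dummies.columns = ["S", "S1", "S2", "S3"]
print(dummies)
```

Because cumcount restarts at every 1, the same code should handle several blocks, as in the updated question. Rows beyond position 14 within a block map to column numbers of 4 or more, which the reindex drops, so those rows come out as all zeros.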

Regarding "Python: Generating fill-down variables in a Pandas DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31305769/
