
python - How to convert date ranges stored as two columns (start, end) into a new row index and compute cumulative rates for the values?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:26:59

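I would like to know how to convert date ranges stored as two columns (start, end) into a new row index. For example, I want to convert the following data: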

         end      start  value
0 2000-01-04 2000-01-02      6
1 2000-01-05 2000-01-03      9

into:

date        rate
2000-01-02     2
2000-01-03     5
2000-01-04     5
2000-01-05     3

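Note:

start and end mark the range (both days included), and each value is spread evenly over the days of its range. I am looking for the sum of all per-day rates on each day.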
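
The expected output can be checked by hand: 6 spread over the three days from 2000-01-02 to 2000-01-04 gives 2 per day, 9 spread over the three days from 2000-01-03 to 2000-01-05 gives 3 per day, and the two ranges overlap on 2000-01-03 and 2000-01-04. A minimal plain-Python sketch of that arithmetic, assuming both endpoints are inclusive (which is what the expected output implies); the names `ranges` and `daily` are only illustrative:

```python
from collections import defaultdict
from datetime import date, timedelta

# (start, end, value) rows from the example; both endpoints are assumed inclusive
ranges = [
    (date(2000, 1, 2), date(2000, 1, 4), 6),
    (date(2000, 1, 3), date(2000, 1, 5), 9),
]

daily = defaultdict(float)
for start, end, value in ranges:
    n_days = (end - start).days + 1   # inclusive day count: Jan 2..Jan 4 is 3 days
    per_day = value / n_days          # 6 / 3 = 2.0 and 9 / 3 = 3.0
    for offset in range(n_days):
        daily[start + timedelta(days=offset)] += per_day

for day in sorted(daily):
    print(day, daily[day])
# 2000-01-02 2.0
# 2000-01-03 5.0
# 2000-01-04 5.0
# 2000-01-05 3.0
```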

Best answer

import pandas as pd
import numpy as np
import io

temp=u"""end,start,value
2000-01-04,2000-01-02,6
2000-01-05,2000-01-03,9"""

df = pd.read_csv(io.StringIO(temp), parse_dates=[0, 1])
print(df)
# reorder the columns so that the dates run from start to end
df = df[['start', 'end', 'value']]

# divide value by the number of days in the range; end - start does not count the first day, so one day is added
df['value'] = df['value'] / (df['end'] + pd.Timedelta('1 days') - df['start']).dt.days

# keep the original row number so each range can be identified later
df['Id'] = df.index
# reshape: move the start and end dates from columns into rows
df = pd.melt(df, id_vars=['value', 'Id'], var_name='D', value_name='Date')
# remove the unnecessary column D
del df['D']
print(df)
#    value  Id       Date
# 0    2.0   0 2000-01-02
# 1    3.0   1 2000-01-03
# 2    2.0   0 2000-01-04
# 3    3.0   1 2000-01-05

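# set a MultiIndex of (Id, Date)
df = df.set_index(['Id', 'Date'])

# fill the gap between the start and end dates of each range, one row per day
f = lambda g: g.asfreq("D", method='ffill')
# select only 'value' so the result does not depend on whether apply keeps the grouping column
df = df.reset_index(level=0).groupby('Id')[['value']].apply(f)

# Id and Date come back from the index as ordinary columns
df = df.reset_index()
print(df)
#    Id       Date  value
# 0   0 2000-01-02    2.0
# 1   0 2000-01-03    2.0
# 2   0 2000-01-04    2.0
# 3   1 2000-01-03    3.0
# 4   1 2000-01-04    3.0
# 5   1 2000-01-05    3.0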

# sum the per-day values of all ranges covering the same date into column rate
df['rate'] = df.groupby('Date')['value'].transform('sum')
# delete the columns that are no longer needed
df = df.drop(['Id', 'value'], axis=1)
# drop the duplicate rows left over from overlapping ranges
df = df.drop_duplicates()
print(df)
#         Date  rate
# 0 2000-01-02   2.0
# 1 2000-01-03   5.0
# 2 2000-01-04   5.0
# 5 2000-01-05   3.0
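
On a more recent pandas the same result can probably be reached with fewer steps. The following is only a sketch of an alternative, not the method from the answer above: it assumes pandas 1.1 or later (for `DataFrame.explode` with `ignore_index`), and the names `csv_text`, `n_days`, `exploded` and `daily` are illustrative.

```python
import io
import pandas as pd

csv_text = """end,start,value
2000-01-04,2000-01-02,6
2000-01-05,2000-01-03,9"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["end", "start"])

# Per-day share of each value; +1 because both start and end are counted.
n_days = (df["end"] - df["start"]).dt.days + 1
df["rate"] = df["value"] / n_days

# One list of days per range, then one row per (range, day).
df["Date"] = [list(pd.date_range(s, e, freq="D")) for s, e in zip(df["start"], df["end"])]
exploded = df.explode("Date", ignore_index=True)
exploded["Date"] = pd.to_datetime(exploded["Date"])

# Add up the shares of all ranges that cover the same day.
daily = exploded.groupby("Date")["rate"].sum().reset_index()
print(daily)
```

Summing with `groupby` directly avoids the `transform` plus `drop_duplicates` step, at the cost of building an intermediate frame with one row per day of every range.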

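Regarding "python - How to convert date ranges stored as two columns (start, end) into a new row index and compute cumulative rates for the values?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/32672675/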
