
python - How do I sum time series data by day in Python? resample.sum() has no effect

Reposted. Author: 行者123. Updated: 2023-12-01 01:48:26

I am new to Python. How can I sum my data by date and plot the result?

I have a Series object with data like this:

2017-11-03 07:30:00 NaN
2017-11-03 09:18:00 NaN
2017-11-03 10:00:00 NaN
2017-11-03 11:08:00 NaN
2017-11-03 14:39:00 NaN
2017-11-03 14:53:00 NaN
2017-11-03 15:00:00 NaN
2017-11-03 16:00:00 NaN
2017-11-03 17:03:00 NaN
2017-11-03 17:42:00 800.0
2017-11-04 07:27:00 600.0
2017-11-04 10:10:00 NaN
2017-11-04 11:48:00 NaN
2017-11-04 12:58:00 500.0
2017-11-04 13:40:00 NaN
2017-11-04 15:15:00 NaN
2017-11-04 16:21:00 NaN
2017-11-04 17:37:00 500.0
2017-11-04 21:37:00 NaN
2017-11-05 03:00:00 NaN
2017-11-05 06:30:00 NaN
2017-11-05 07:19:00 NaN
2017-11-05 08:31:00 200.0
2017-11-05 09:31:00 500.0
2017-11-05 12:03:00 NaN
2017-11-05 12:25:00 200.0
2017-11-05 13:11:00 500.0
2017-11-05 16:31:00 NaN
2017-11-05 19:00:00 500.0
2017-11-06 08:08:00 NaN

I have the following code:

# load packages
import pandas as pd
import matplotlib.pyplot as plt

# import painkiller data
df = pd.read_csv('/Users/user/Documents/health/PainOverTime.csv',delimiter=',')

# plot bar graph of date and painkiller amount
times = pd.to_datetime(df.loc[:,'Time'])

ts = pd.Series(df.loc[:,'acetaminophen'].values, index = times,
               name = 'Painkiller over Time')
ts.plot()

This gives me the following line(?) graph:

[Figure: line graph of raw data]

That's a start; now I want to sum the doses by date. However, the following code changes nothing: the resulting plot is identical. What is wrong?

ts.resample('D',closed='left', label='right').sum()
ts.plot()

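I also tried ts.resample('D').sum(), ts.resample('1d').sum(), and ts.resample('1D').sum(), but the plot did not change.

Is .resample the right function? As I understand it, resampling means sampling from the data, e.g. taking one random point per day, whereas I want to sum the values for each day.

In other words, I would like to get something like this (based on the data above):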

2017-11-03 800
2017-11-04 1600
2017-11-05 1900
2017-11-06 NaN

Best answer

Use the pandas groupby function.

import io
import pandas as pd

data = io.StringIO('''
2017-11-03 07:30:00,NaN
2017-11-03 09:18:00,NaN
2017-11-03 10:00:00,NaN
2017-11-03 11:08:00,NaN
2017-11-03 14:39:00,NaN
2017-11-03 14:53:00,NaN
2017-11-03 15:00:00,NaN
2017-11-03 16:00:00,NaN
2017-11-03 17:03:00,NaN
2017-11-03 17:42:00,800.0
2017-11-04 07:27:00,600.0
2017-11-04 10:10:00,NaN
2017-11-04 11:48:00,NaN
2017-11-04 12:58:00,500.0
2017-11-04 13:40:00,NaN
2017-11-04 15:15:00,NaN
2017-11-04 16:21:00,NaN
2017-11-04 17:37:00,500.0
2017-11-04 21:37:00,NaN
2017-11-05 03:00:00,NaN
2017-11-05 06:30:00,NaN
2017-11-05 07:19:00,NaN
2017-11-05 08:31:00,200.0
2017-11-05 09:31:00,500.0
2017-11-05 12:03:00,NaN
2017-11-05 12:25:00,200.0
2017-11-05 13:11:00,500.0
2017-11-05 16:31:00,NaN
2017-11-05 19:00:00,500.0
2017-11-06 08:08:00,NaN
''')
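column_names = ['date', 'val']
df = pd.read_csv(data, sep=',', header=None, names=column_names)
# parse the timestamp strings so the .dt accessor is available
df['date'] = pd.to_datetime(df['date'])
# group rows by calendar date and sum each day's values (NaN is skipped)
df = df.groupby(df['date'].dt.date)[['val']].sum()
df.plot()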
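
As a complement to the groupby approach, here is a minimal sketch of how resample can also produce daily sums. In the question's snippet the result of ts.resample(...).sum() is never assigned, so the following ts.plot() still draws the original series; resample returns a new object and does not modify ts. The sketch uses a subset of the rows from the question, and the names raw, ts and daily are chosen here for illustration. Passing min_count=1 is an optional choice made here so that a day containing only NaN stays NaN, as in the desired output, instead of summing to 0.

```python
import io
import pandas as pd

# A subset of the rows shown in the question (timestamp, dose).
raw = io.StringIO(
    "2017-11-03 17:03:00,NaN\n"
    "2017-11-03 17:42:00,800.0\n"
    "2017-11-04 07:27:00,600.0\n"
    "2017-11-04 12:58:00,500.0\n"
    "2017-11-04 17:37:00,500.0\n"
    "2017-11-05 08:31:00,200.0\n"
    "2017-11-05 09:31:00,500.0\n"
    "2017-11-05 12:25:00,200.0\n"
    "2017-11-05 13:11:00,500.0\n"
    "2017-11-05 19:00:00,500.0\n"
    "2017-11-06 08:08:00,NaN\n"
)
df = pd.read_csv(raw, header=None, names=["date", "val"])
df["date"] = pd.to_datetime(df["date"])

# Build a Series indexed by timestamp, like ts in the question.
ts = df.set_index("date")["val"]

# resample() returns a new object, so keep the result in a variable.
# min_count=1 leaves a day with no valid values as NaN instead of 0.
daily = ts.resample("D").sum(min_count=1)
print(daily)
```

Plotting the aggregated series rather than the original one, for example with daily.plot(kind='bar'), should then show one bar per day.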

Regarding "python - How do I sum time series data by day in Python? resample.sum() has no effect", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50983386/
