
python - Filling data for missing dates with zeros when merging DataFrames

Reposted. Author: 行者123. Updated: 2023-11-28 21:35:36

import pandas as pd
import numpy as np

one = pd.read_csv('data1.csv')
two = pd.read_csv('data2.csv')

With the code above, one shows:

A    Date
10 2011-01-03
20 2011-01-04
10 2011-01-06
20 2011-01-07
30 2011-01-10
40 2011-01-13
25 2011-01-15



two shows:

B    Date
15 2011-01-01
15 2011-01-02
15 2011-01-03
25 2011-01-07
35 2011-01-10
10 2011-01-13
25 2011-01-15



When the DataFrames are merged, I want to use 0 for the data on missing dates. This is the code I have now:

one_and_two = pd.merge(one, two, on='Date', how='inner')
print(one_and_two)

When I run it, one_and_two is:

    A        Date    B
0 10 2011-01-03 15
1 20 2011-01-07 25
2 30 2011-01-10 35
3 40 2011-01-13 10
4 25 2011-01-15 25



The desired output is:

    A        Date    B
0 0 2011-01-01 15
1 0 2011-01-02 15
2 10 2011-01-03 15
3 20 2011-01-04 0
4 0 2011-01-05 0
5 10 2011-01-06 0
6 20 2011-01-07 25
7 0 2011-01-08 0
8 0 2011-01-09 0
9 30 2011-01-10 35



The DataFrames cover 2011-01-01 to 2011-12-31, and I want 0 for the data on every missing date. How can I do that? What is wrong with my code?

Best answer

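Use an outer join together with reindex over a defined date range. An inner join keeps only the dates present in both frames, which is why rows go missing. This assumes the Date column holds datetime values (for example, read with parse_dates=['Date']), so that it matches the index built by pd.date_range: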

df = (pd.merge(one, two, on='Date', how='outer')
.fillna(0)
.sort_values('Date')
.set_index('Date'))

df = (df.reindex(pd.date_range('2011-01-01', '2011-12-31', name='Date'), fill_value=0)
.reset_index()
.reindex(columns=['A','Date','B']))
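
As a rough, self-contained sketch of why the inner join loses rows and how the outer join plus reindex restores them, the snippet below rebuilds the question's two frames inline instead of reading data1.csv and data2.csv. The inline construction and the names inner, outer, full_range and filled are assumptions for illustration, not part of the original answer:

```python
import pandas as pd

# Frames rebuilt inline from the values shown in the question
# (assumption: the CSV files contain exactly these rows).
one = pd.DataFrame({
    'A': [10, 20, 10, 20, 30, 40, 25],
    'Date': pd.to_datetime(['2011-01-03', '2011-01-04', '2011-01-06',
                            '2011-01-07', '2011-01-10', '2011-01-13',
                            '2011-01-15']),
})
two = pd.DataFrame({
    'B': [15, 15, 15, 25, 35, 10, 25],
    'Date': pd.to_datetime(['2011-01-01', '2011-01-02', '2011-01-03',
                            '2011-01-07', '2011-01-10', '2011-01-13',
                            '2011-01-15']),
})

# Inner join: only dates that appear in BOTH frames survive.
inner = pd.merge(one, two, on='Date', how='inner')

# Outer join: the union of dates survives; the missing side becomes NaN.
outer = pd.merge(one, two, on='Date', how='outer')

# Every calendar day of 2011, used as the target index.
full_range = pd.date_range('2011-01-01', '2011-12-31', name='Date')

# Replace NaN with 0, then add all-zero rows for days absent from both frames.
filled = (outer.fillna(0)
               .set_index('Date')
               .reindex(full_range, fill_value=0)
               .reset_index())
```

With this data, inner has 5 rows (dates present in both frames), outer has 9 (the union of dates), and filled has 365, one per day of 2011.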

Or by the minimum and maximum dates:

df = (df.reindex(pd.date_range(df.index.min(), df.index.max(), name='Date'), fill_value=0)
.reset_index()
.reindex(columns=['A','Date','B']))
print(df)
A Date B
0 0.0 2011-01-01 15.0
1 0.0 2011-01-02 15.0
2 10.0 2011-01-03 15.0
3 20.0 2011-01-04 0.0
4 0.0 2011-01-05 0.0
5 10.0 2011-01-06 0.0
6 20.0 2011-01-07 25.0
7 0.0 2011-01-08 0.0
8 0.0 2011-01-09 0.0
9 30.0 2011-01-10 35.0
10 0.0 2011-01-11 0.0
11 0.0 2011-01-12 0.0
12 40.0 2011-01-13 10.0
13 0.0 2011-01-14 0.0
14 25.0 2011-01-15 25.0
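
The values print as floats because the outer join introduces NaN for dates missing from one side, which forces A and B to float, and fillna(0) keeps that dtype. If integer columns are wanted, as in the desired output, one possible follow-up is to cast once the NaN values are gone. The sketch below uses a small hypothetical stand-in for df rather than the full result:

```python
import pandas as pd

# Hypothetical stand-in for the merged, zero-filled frame (first three rows only).
df = pd.DataFrame({
    'A': [0.0, 0.0, 10.0],
    'Date': pd.to_datetime(['2011-01-01', '2011-01-02', '2011-01-03']),
    'B': [15.0, 15.0, 15.0],
})

# No NaN remains after fillna(0), so the float columns can be cast to integers.
df = df.astype({'A': 'int64', 'B': 'int64'})
```

The cast would raise an error if any NaN were still present, so it belongs after the fillna and reindex steps.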

Regarding "python - Filling data for missing dates with zeros when merging DataFrames", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52071482/
