gpt4 book ai didi

python - 合并两个具有重复日期时间索引的 Pandas 系列

转载 作者:太空宇宙 更新时间:2023-11-03 14:54:40 25 4
gpt4 key购买 nike

我有两个按日期时间索引的 Pandas 系列(d1 和 d2),每个系列都包含一列包含 float 和 NaN 的数据。尽管时间条目与许多缺失天数的时期不一致,但这两个指数均以一天为间隔。 d1 的范围为 1974-12-16 至 2002-01-30。 d2 范围为 1997-12-19 至 2017-07-06。从1997年12月19日到2002年1月30日期间,两个系列之间存在许多重复的索引。重复索引的数据有时是相同的值、不同的值或一个值和 NaN。

我想将这两个系列合并为一个,在存在重复索引时优先考虑 d2 中的数据(即,在存在重复索引时将所有 d1 数据替换为 d2 数据)。在众多可用的 Pandas 工具(合并、联接、连接等)中,最有效的方法是什么?

这是我的数据示例:

In [7]: print d1
fldDate
1974-12-16 19.0
1974-12-17 28.0
1974-12-18 24.0
1974-12-19 18.0
1974-12-20 17.0
1974-12-21 28.0
1974-12-22 28.0
1974-12-23 10.0
1974-12-24 6.0
1974-12-25 5.0
1974-12-26 12.0
1974-12-27 19.0
1974-12-28 22.0
1974-12-29 20.0
1974-12-30 16.0
1974-12-31 12.0
1975-01-01 12.0
1975-01-02 15.0
1975-01-03 14.0
1975-01-04 15.0
1975-01-05 18.0
1975-01-06 21.0
1975-01-07 22.0
1975-01-08 18.0
1975-01-09 20.0
1975-01-10 12.0
1975-01-11 8.0
1975-01-12 -2.0
1975-01-13 13.0
1975-01-14 24.0
...
2002-01-01 18.0
2002-01-02 16.0
2002-01-03 NaN
2002-01-04 24.0
2002-01-05 23.0
2002-01-06 15.0
2002-01-07 22.0
2002-01-08 34.0
2002-01-09 35.0
2002-01-10 29.0
2002-01-11 21.0
2002-01-12 24.0
2002-01-13 NaN
2002-01-14 18.0
2002-01-15 14.0
2002-01-16 10.0
2002-01-17 5.0
2002-01-18 7.0
2002-01-19 7.0
2002-01-20 7.0
2002-01-21 11.0
2002-01-22 NaN
2002-01-23 9.0
2002-01-24 8.0
2002-01-25 15.0
2002-01-26 NaN
2002-01-27 NaN
2002-01-28 18.0
2002-01-29 13.0
2002-01-30 13.0
Name: MaxTempMid, dtype: float64

In [8]: print d2
fldDate
1997-12-19 22.0
1997-12-20 14.0
1997-12-21 18.0
1997-12-22 16.0
1997-12-23 16.0
1997-12-24 10.0
1997-12-25 12.0
1997-12-26 12.0
1997-12-27 9.0
1997-12-28 12.0
1997-12-29 18.0
1997-12-30 23.0
1997-12-31 28.0
1998-01-01 26.0
1998-01-02 29.0
1998-01-03 27.0
1998-01-04 22.0
1998-01-05 19.0
1998-01-06 17.0
1998-01-07 14.0
1998-01-08 14.0
1998-01-09 14.0
1998-01-10 16.0
1998-01-11 20.0
1998-01-12 21.0
1998-01-13 19.0
1998-01-14 20.0
1998-01-15 16.0
1998-01-16 17.0
1998-01-17 20.0
...
2017-06-07 68.0
2017-06-08 71.0
2017-06-09 71.0
2017-06-10 59.0
2017-06-11 41.0
2017-06-12 57.0
2017-06-13 58.0
2017-06-14 36.0
2017-06-15 50.0
2017-06-16 58.0
2017-06-17 54.0
2017-06-18 53.0
2017-06-19 58.0
2017-06-20 68.0
2017-06-21 71.0
2017-06-22 71.0
2017-06-23 59.0
2017-06-24 61.0
2017-06-25 65.0
2017-06-26 68.0
2017-06-27 71.0
2017-06-28 60.0
2017-06-29 54.0
2017-06-30 48.0
2017-07-01 60.0
2017-07-02 68.0
2017-07-03 65.0
2017-07-04 73.0
2017-07-05 74.0
2017-07-06 77.0
Name: MaxTempMid, dtype: float64

最佳答案

让我们使用,combine_first :

df2.combine_first(df1)

输出:

fldDate
1974-12-16 19.0
1974-12-17 28.0
1974-12-18 24.0
1974-12-19 18.0
1974-12-20 17.0
1974-12-21 28.0
1974-12-22 28.0
1974-12-23 10.0
1974-12-24 6.0
1974-12-25 5.0
1974-12-26 12.0
1974-12-27 19.0
1974-12-28 22.0
1974-12-29 20.0
1974-12-30 16.0
1974-12-31 12.0
1975-01-01 12.0
1975-01-02 15.0
1975-01-03 14.0
1975-01-04 15.0
1975-01-05 18.0
1975-01-06 21.0
1975-01-07 22.0
1975-01-08 18.0
1975-01-09 20.0
1975-01-10 12.0
1975-01-11 8.0
1975-01-12 -2.0
1975-01-13 13.0
1975-01-14 24.0
...
2017-06-07 68.0
2017-06-08 71.0
2017-06-09 71.0
2017-06-10 59.0
2017-06-11 41.0
2017-06-12 57.0
2017-06-13 58.0
2017-06-14 36.0
2017-06-15 50.0
2017-06-16 58.0
2017-06-17 54.0
2017-06-18 53.0
2017-06-19 58.0
2017-06-20 68.0
2017-06-21 71.0
2017-06-22 71.0
2017-06-23 59.0
2017-06-24 61.0
2017-06-25 65.0
2017-06-26 68.0
2017-06-27 71.0
2017-06-28 60.0
2017-06-29 54.0
2017-06-30 48.0
2017-07-01 60.0
2017-07-02 68.0
2017-07-03 65.0
2017-07-04 73.0
2017-07-05 74.0
2017-07-06 77.0

关于python - 合并两个具有重复日期时间索引的 Pandas 系列,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/45677089/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com