
python - Merging data from multiple data frames on multiple conditions

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:24:58

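I want to merge several data frames, but only where the keys match and the date falls within 90 days of the "InitialAdmit" date in df1. I want to keep every row of df1 and merge in rows from df2, df3, and so on only when they match on the key and fall within the date range.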

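Note: merging the dfs first and then applying the date range condition did not work for me. I tried that approach first, but in many cases the merge succeeded and then, because the date was outside the range, my script dropped the row. I need to keep all rows of df1 somehow.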

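Python Pandas: Merging data frames on multiple conditions - that question is similar, but it seems to merge first and then apply the condition. I think a better approach is to apply the condition first and merge only when it is met. I am open to suggestions, though.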

Data frames:

import pandas

a = {'Key': [100000204, 100000255, 100000271, 100000286, 100000628],
     'InitialAdmit': ['2012-06-04', '2012-05-03', '2012-01-16', '2012-10-26', '2012-02-21'],
     '90DayRange': ['2012-09-02', '2012-08-01', '2012-04-15', '2013-01-24', '2012-05-21']
     }
df1 = pandas.DataFrame(data=a)
df1


b = {'Key': [100000208, 100000255, 100000723, 100000286, 100000866],
     'InitialAdmit': ['2012-01-22', '2012-06-03', '2012-10-26', '2012-11-26', '2012-05-11'],
     }
df2 = pandas.DataFrame(data=b)
df2

c = {'Key': [100000255, 100000255, 100000702, 100000221, 100000628],
     'InitialAdmit': ['2012-06-22', '2012-10-03', '2012-10-26', '2012-11-26', '2012-04-11'],
     }
df3 = pandas.DataFrame(data=c)
df3
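The dates in these frames are plain strings. As a minimal sketch, assuming real date comparisons are wanted, the columns can be parsed with pandas.to_datetime, and the "90DayRange" column can be checked against "InitialAdmit" plus 90 days. The names df1_dt and recomputed are hypothetical and do not appear in the question.

```python
import pandas

df1 = pandas.DataFrame({
    'Key': [100000204, 100000255, 100000271, 100000286, 100000628],
    'InitialAdmit': ['2012-06-04', '2012-05-03', '2012-01-16', '2012-10-26', '2012-02-21'],
    '90DayRange': ['2012-09-02', '2012-08-01', '2012-04-15', '2013-01-24', '2012-05-21'],
})

# Hypothetical working copy: parse both date columns into datetime64 values.
df1_dt = df1.copy()
for col in ['InitialAdmit', '90DayRange']:
    df1_dt[col] = pandas.to_datetime(df1_dt[col])

# Recompute the end of the window from the admit date and compare it
# with the '90DayRange' column supplied in the question.
recomputed = df1_dt['InitialAdmit'] + pandas.Timedelta(days=90)
print((recomputed == df1_dt['90DayRange']).all())
```

With datetime columns, missing values show up as NaT rather than NaN, which is what the answer's output displays.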

Script:

df_NotIncludedRows = pandas.DataFrame()
df_final = pandas.DataFrame()
dfs = [df2] #I plan to add more dataframes so I'm iterating through this list of dfs

for df in dfs: #iterate through each df in dfs
    for key in df1["Key"]: #iterate through each key found in column 'Id'
        if key in df["Key"]: # find any matching key from df1 in df2 (part of my issue exists here)
            if (df["Admit"] >= df1["InitialAdmit"]) | (df["Admit"] <= df1["90DayRange"]):
                df_final = pandas.merge(df1,df.loc[:],on='Key',how='left') # my df.loc[:] is a little off i think
            else:
                df_NotIncludedRows = df_NotIncludedRows.append(df.loc[:]) # same df.loc[:] issue i believe
df_NotIncludedRows
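Two pandas behaviours likely contribute to the trouble in the script above. The `in` operator on a Series tests the index labels rather than the values, and a boolean Series cannot be used directly as an `if` condition. The sketch below illustrates both points on df2; the variable names in_index, in_values, mask and ambiguous are hypothetical.

```python
import pandas

df2 = pandas.DataFrame({
    'Key': [100000208, 100000255, 100000723, 100000286, 100000866],
    'InitialAdmit': ['2012-01-22', '2012-06-03', '2012-10-26', '2012-11-26', '2012-05-11'],
})

key = 100000255

# `in` on a Series looks at the index labels (0..4 here), not the values.
in_index = key in df2['Key']

# Testing against the underlying values finds the key.
in_values = key in df2['Key'].values

# Vectorized membership test for several keys at once.
mask = df2['Key'].isin([100000255, 100000286])

# A Series with more than one element has no single truth value,
# so using it in `if` raises ValueError.
try:
    bool(df2['Key'] > 0)
    ambiguous = False
except ValueError:
    ambiguous = True

print(in_index, in_values, mask.tolist(), ambiguous)
```

Element-wise conditions are usually combined with `&` and `|` into a mask and then used for indexing, rather than evaluated row by row in a loop.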

Best answer

I would still recommend merge then filter; here we use boolean indexing and combine_first.

df=df1.merge(df2,on='Key')
# keep rows whose df2 date lies between df1's InitialAdmit and 90DayRange
m=(df.InitialAdmit_y>=df.InitialAdmit_x)&(df.InitialAdmit_y<=df['90DayRange'])
df1.set_index('Key').combine_first(df[m].set_index('Key'))


Out[215]:
           90DayRange  InitialAdmit  InitialAdmit_x  InitialAdmit_y
Key
100000204  2012-09-02    2012-06-04             NaT             NaT
100000255  2012-08-01    2012-05-03      2012-05-03      2012-06-03
100000271  2012-04-15    2012-01-16             NaT             NaT
100000286  2013-01-24    2012-10-26      2012-10-26      2012-11-26
100000628  2012-05-21    2012-02-21             NaT             NaT
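The question mentions adding more data frames later. The following is one possible sketch of extending merge then filter to several frames while keeping every row of df1. It is a variation, not the answerer's exact method: it uses a left merge in place of combine_first, and the names base, others, result and the Admit_<name> columns are hypothetical.

```python
import pandas

df1 = pandas.DataFrame({
    'Key': [100000204, 100000255, 100000271, 100000286, 100000628],
    'InitialAdmit': ['2012-06-04', '2012-05-03', '2012-01-16', '2012-10-26', '2012-02-21'],
    '90DayRange': ['2012-09-02', '2012-08-01', '2012-04-15', '2013-01-24', '2012-05-21'],
})
df2 = pandas.DataFrame({
    'Key': [100000208, 100000255, 100000723, 100000286, 100000866],
    'InitialAdmit': ['2012-01-22', '2012-06-03', '2012-10-26', '2012-11-26', '2012-05-11'],
})
df3 = pandas.DataFrame({
    'Key': [100000255, 100000255, 100000702, 100000221, 100000628],
    'InitialAdmit': ['2012-06-22', '2012-10-03', '2012-10-26', '2012-11-26', '2012-04-11'],
})

# Parse df1's date columns so comparisons are real date comparisons.
base = df1.copy()
for col in ['InitialAdmit', '90DayRange']:
    base[col] = pandas.to_datetime(base[col])

others = {'df2': df2, 'df3': df3}  # hypothetical labels for the extra frames
result = base.copy()

for name, other in others.items():
    col = f'Admit_{name}'
    # Give each frame's date column a unique name and parse it.
    other = other.rename(columns={'InitialAdmit': col})
    other[col] = pandas.to_datetime(other[col])

    # Merge on the key, then keep only rows inside df1's 90-day window.
    merged = base.merge(other, on='Key', how='inner')
    in_window = (merged[col] >= merged['InitialAdmit']) & (merged[col] <= merged['90DayRange'])
    kept = merged.loc[in_window, ['Key', col]]

    # Left merge back onto the running result so no df1 row is lost.
    result = result.merge(kept, on='Key', how='left')

print(result)
```

If a key has more than one in-window match in the same frame, the left merge would repeat that df1 row once per match; that does not happen with the sample data here, but it may need handling for real data.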

Regarding python - merging data from multiple data frames on multiple conditions, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47842009/
