
python - Pandas concat of multiple dataframes returns null values

Reposted. Author: 行者123. Updated: 2023-12-02 04:38:51

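I have a dataframe (df) that I split into 4 new dfs (media, client, code_type, and date). media has one column of null values, while the other three are 1-dim dfs, each consisting entirely of null values. After replacing the nulls in each dataframe, I try to pd.concat them into a single df and get the following results.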

 code_type
0 P
1 P
2 P
3 P
4 P
5 P

code_name media_type acq. revenue
0 RASH NaN 50.0 34004.0
1 100 NaN 10.0 1035.0
2 NEWS NaN 61.0 3475.0
3 DR NaN 53.0 4307.0
4 SPORTS NaN 45.0 6503.0
5 DOUBL NaN 13.0 4205.0

client_id
0 2.0
1 2.0
2 2.0
3 2.0
4 2.0
5 2.0

date
0 2016-08-15
1 2016-08-15
2 2016-08-15
3 2016-08-15
4 2016-08-15
5 2016-08-15

I pd.merge media with a separate df to replace the NaN values under media.media_type, which appends a new media_type_y column:

code_name   media_type_x    acq.    revenue  media_type_y
0 RASH NaN 282 34004.0 Radio
1 100 NaN 119 1035.0 NaN
2 NEWS NaN 81 3475.0 SiriusXM
3 DR NaN 33 4307.0 SiriusXM
4 SPORTS NaN 25 6503.0 SiriusXM
5 DOUBL NaN 23 4205.0 Podcast

I then drop media_type_x and rename media_type_y to media_type:

final = m.loc[:,('code_name','media_type_y', 'acquisition', 'revenue')]
final = final.rename(columns={'media_type_y': 'media_type'})
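As an aside, the drop-and-rename step can be avoided by removing the empty column before merging. The following is only a minimal sketch: the `lookup` frame, its contents, and the trimmed-down `media` frame are hypothetical stand-ins, since the question does not show the separate df used in the merge.

```python
import pandas as pd

# Hypothetical stand-in for the question's media frame (media_type is all NaN).
media = pd.DataFrame(
    {
        "code_name": ["RASH", "100", "NEWS"],
        "media_type": [float("nan")] * 3,
        "revenue": [34004.0, 1035.0, 3475.0],
    }
)

# Hypothetical lookup table; the real one is not shown in the question.
lookup = pd.DataFrame(
    {"code_name": ["RASH", "NEWS"], "media_type": ["Radio", "SiriusXM"]}
)

# Dropping the all-NaN column first means the merge has no name clash,
# so no media_type_x / media_type_y suffixes are created.
final = media.drop(columns="media_type").merge(lookup, on="code_name", how="left")
print(final)
```

A left merge keeps every row of media, so codes missing from the lookup (here "100") still end up with NaN in media_type.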

That way, when I concat, I should have a complete df.

clean = pd.concat([media, client, code_type, date], axis=1)  

code media acq. revenue client code_type date
0 RASH Radio 50.0 34004.0 NaN NaN NaT
1 100 NaN 10.0 1035.0 NaN NaN NaT
2 NEWS SiriusXM 61.0 3475.0 NaN NaN NaT
3 DR SiriusXM 53.0 4307.0 NaN NaN NaT
4 SPORTS SiriusXM 45.0 6503.0 NaN NaN NaT
5 DOUBL Podcast 13.0 4205.0 NaN NaN NaT


clean.client should all be 2
clean.code_type should all be P
clean.date should all be 08/15/2016

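The dfs themselves show the data; I only lose the information when I concat. I think it might have something to do with the index, but I'm not sure. It could also be related to the fact that I have a column containing both str and int values (see clean.code above), which may be why I get the runtime warning listed below.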

//anaconda/lib/python3.5/site-packages/pandas/indexes/api.py:71: RuntimeWarning: unorderable types: int() < str(), sort order is undefined for incomparable objects result = result.union(other)

Best answer

Starting from this:

  code_name media_type  acq.  revenue
0 RASH Radio 50.0 34004.0
1 100 NaN 10.0 1035.0
2 NEWS SiriusXM 61.0 3475.0
3 DR SiriusXM 53.0 4307.0
4 SPORTS SiriusXM 45.0 6503.0
5 DOUBL Podcast 13.0 4205.0

Try this:

df['client_id'] = 2
df['date'] = '08/15/2016'
df['code_type'] = 'P'
df

code_name media_type acq. revenue client_id date code_type
0 RASH Radio 50.0 34004.0 2 08/15/2016 P
1 100 NaN 10.0 1035.0 2 08/15/2016 P
2 NEWS SiriusXM 61.0 3475.0 2 08/15/2016 P
3 DR SiriusXM 53.0 4307.0 2 08/15/2016 P
4 SPORTS SiriusXM 45.0 6503.0 2 08/15/2016 P
5 DOUBL Podcast 13.0 4205.0 2 08/15/2016 P
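Assigning constants sidesteps pd.concat entirely. If the separate frames have to be kept, a likely cause of the NaN values is index alignment: pd.concat with axis=1 matches rows by index label, not by position. The warning in the question is raised while taking a union of indexes, which suggests the frames' indexes do not share the same labels. The sketch below is a minimal illustration only; the frames and their index labels are assumed, not taken from the question's data.

```python
import pandas as pd

# Hypothetical stand-ins for two of the question's frames.
media = pd.DataFrame(
    {"code_name": ["RASH", "100", "NEWS"], "revenue": [34004.0, 1035.0, 3475.0]}
)  # default index: 0, 1, 2
client = pd.DataFrame({"client_id": [2.0, 2.0, 2.0]}, index=[10, 11, 12])  # assumed labels

# concat(axis=1) aligns on index labels; labels that do not match
# produce extra rows filled with NaN.
misaligned = pd.concat([media, client], axis=1)

# Resetting each index to 0..n-1 makes the rows line up by position.
frames = [media, client]
clean = pd.concat([f.reset_index(drop=True) for f in frames], axis=1)
print(clean)
```

This only makes sense when the frames have the same number of rows in the same order; comparing `media.index` with `client.index` before concatenating shows whether the labels actually differ.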

Regarding python - Pandas concat of multiple dataframes returns null values, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39172575/
