
python - How do I replace only None with an empty string using Pandas?

Reposted. Author: 太空狗. Updated: 2023-10-30 00:31:46

The code below generates a df:

import pandas as pd
from datetime import datetime as dt
import numpy as np

dates = [dt(2014, 1, 2, 2), dt(2014, 1, 2, 3), dt(2014, 1, 2, 4), None]
strings1 = ['A', 'B',None, 'C']
strings2 = [None, 'B','C', 'C']
strings3 = ['A', 'B','C', None]
vals = [1.,2.,np.nan, 4.]
df = pd.DataFrame(dict(zip(['A','B','C','D','E'],
[strings1, dates, strings2, strings3, vals])))



+---+------+---------------------+------+------+-----+
| | A | B | C | D | E |
+---+------+---------------------+------+------+-----+
| 0 | A | 2014-01-02 02:00:00 | None | A | 1 |
| 1 | B | 2014-01-02 03:00:00 | B | B | 2 |
| 2 | None | 2014-01-02 04:00:00 | C | C | NaN |
| 3 | C | NaT | C | None | 4 |
+---+------+---------------------+------+------+-----+

I want to replace every None in it (the real Python None, not the string 'None') with '' (an empty string).

Expected df:

+---+---+---------------------+---+---+-----+
| | A | B | C | D | E |
+---+---+---------------------+---+---+-----+
| 0 | A | 2014-01-02 02:00:00 | | A | 1 |
| 1 | B | 2014-01-02 03:00:00 | B | B | 2 |
| 2 | | 2014-01-02 04:00:00 | C | C | NaN |
| 3 | C | NaT | C | | 4 |
+---+---+---------------------+---+---+-----+

What I did was:

df = df.replace([None], [''], regex=True)

But I got:

+---+---+---------------------+---+------+---+
| | A | B | C | D | E |
+---+---+---------------------+---+------+---+
| 0 | A | 1388628000000000000 | | A | 1 |
| 1 | B | 1388631600000000000 | B | B | 2 |
| 2 | | 1388635200000000000 | C | C | |
| 3 | C | | C | | 4 |
+---+---+---------------------+---+------+---+

  1. All the dates were turned into large integers.
  2. Even NaT and NaN were replaced, which I don't want.

How can I do this correctly and efficiently?

Best answer

This is enough:

df.fillna("",inplace=True)
df
Out[142]:
   A                    B  C  D  E
0  A  2014-01-02 02:00:00     A  1
1  B  2014-01-02 03:00:00  B  B  2
2     2014-01-02 04:00:00  C  C
3  C                       C     4
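
The output above also blanks out the NaT in B and the NaN in E. A likely reason is that pandas' missing-value check treats None, NaN and NaT alike. The following is a minimal check of that behaviour; the names `markers` and `is_missing` are only illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# The three missing-value markers that appear in the question's df.
markers = {"None": None, "NaN": np.nan, "NaT": pd.NaT}

# pd.isna does not distinguish between them, so a blanket fillna("")
# has no way to fill None while leaving NaN and NaT untouched.
is_missing = {name: bool(pd.isna(value)) for name, value in markers.items()}
print(is_missing)
```

Because all three count as missing, limiting the fill to specific columns is the practical way to keep NaT and NaN intact.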

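Edit 2021-07-26: answer completed following @dWitty's comment.

If you really want to keep the NaT and NaN values as they are instead of turning them into text, fill NA only in your text columns. In your example, these are A, C and D.

Just pass a dictionary of replacement values keyed by column name. The value can differ per column. In your case, you only need to build that dictionary: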

# default values to replace NA (None), one entry per text column
# equivalent to: values = {"A": "", "C": "", "D": ""}
values = {col: "" for col in ['A', 'C', 'D']}
df.fillna(value=values, inplace=True)
df
Out[142]:
   A                   B  C  D    E
0  A 2014-01-02 02:00:00     A  1.0
1  B 2014-01-02 03:00:00  B  B  2.0
2    2014-01-02 04:00:00  C  C  NaN
3  C                 NaT  C     4.0
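
If the text columns are not known in advance, the per-column dictionary can be built from the column dtypes instead of a hard-coded list. This is only a sketch, not part of the original answer. It assumes every object-dtype column is a text column that should be filled, and it builds the frame with an explicit `dtype=object` so that the text columns are object-dtype regardless of the pandas version's defaults. The names `text_cols` and `filled` are hypothetical.

```python
import numpy as np
import pandas as pd

# Small frame mirroring the question: A, C, D hold None, B holds NaT, E holds NaN.
df = pd.DataFrame({
    "A": pd.Series(["A", "B", None, "C"], dtype=object),
    "B": pd.Series(pd.to_datetime(
        ["2014-01-02 02:00", "2014-01-02 03:00", "2014-01-02 04:00", None])),
    "C": pd.Series([None, "B", "C", "C"], dtype=object),
    "D": pd.Series(["A", "B", "C", None], dtype=object),
    "E": [1.0, 2.0, np.nan, 4.0],
})

# Assumption: object-dtype columns are exactly the text columns to fill.
text_cols = df.select_dtypes(include="object").columns

# Fill only those columns; the datetime and float columns are not in the dict,
# so their NaT and NaN values are left alone.
filled = df.fillna({col: "" for col in text_cols})
print(filled)
```

Note that an object-dtype column may also contain NaN, which would be filled as well; if that matters, the explicit column list from the answer is the safer choice.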

Regarding "python - How do I replace only None with an empty string using Pandas?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31295740/
