python - How to convert NaN to [''] in all elements of a Pandas DataFrame?

Reposted. Author: 行者123. Updated: 2023-12-01 01:48:36

import pandas as pd
import numpy as np

df = pd.DataFrame({
'A': [[1, 2, 3, 4], [4, 5, 6, 7, 8], [7, 6, 4], np.nan, [1, 2]],
'B': [[1, 2, 3, 4], [4, 5, 6, 7, 8], [3, 7, 9], np.nan, [4, 5]],
'E': [np.nan, np.nan, np.nan, np.nan, np.nan],
'F': [[2, 2], [4, 4], np.nan, [78, 90], np.nan]
})

# First try
# ERROR: Cannot do inplace boolean setting on mixed-types with a non np.nan value
# df[df.isnull()] = df[df.isnull()].applymap(lambda x: [''])

# Second try
# ERROR: Invalid "to_replace" type: 'float'
# df.replace(to_replace=np.nan, value=[''], inplace=True)

# Third try
# RESULT: The column 'E' disappears and the remaining NaN values are converted to None
# stack = df.stack()
# stack[stack.isnull()] = [''] # or stack[stack == np.nan] = ['']
# stack.unstack()

# Fourth try
# ERROR: "value" parameter must be a scalar or dict, but you passed a "list"
# df.fillna([''])

This is my expected result:

df = pd.DataFrame({
'A': [[1, 2, 3, 4], [4, 5, 6, 7, 8], [7, 6, 4], [''], [1, 2]],
'B': [[1, 2, 3, 4], [4, 5, 6, 7, 8], [3, 7, 9], [''], [4, 5]],
'E': [[''], [''], [''], [''], ['']],
'F': [[2, 2], [4, 4], [''], [78, 90], ['']]
})

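I have tried every approach shown in the example, without success. How can I achieve this?

Note: the replacement should be a list with a single element (an empty string). Alternatively, it could be [np.nan].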

Best answer

Update:

In [136]: df.applymap(lambda x: x if isinstance(x, list) else [])
Out[136]:
A B E F
0 [1, 2, 3, 4] [1, 2, 3, 4] [] [2, 2]
1 [4, 5, 6, 7, 8] [4, 5, 6, 7, 8] [] [4, 4]
2 [7, 6, 4] [3, 7, 9] [] []
3 [] [] [] [78, 90]
4 [1, 2] [4, 5] [] []

Or:

In [152]: df = df.applymap(lambda x: x if isinstance(x, list) else [np.nan])

In [153]: df
Out[153]:
A B E F
0 [1, 2, 3, 4] [1, 2, 3, 4] [nan] [2, 2]
1 [4, 5, 6, 7, 8] [4, 5, 6, 7, 8] [nan] [4, 4]
2 [7, 6, 4] [3, 7, 9] [nan] [nan]
3 [nan] [nan] [nan] [78, 90]
4 [1, 2] [4, 5] [nan] [nan]
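
Neither variant above produces the exact `['']` asked for in the question, but the same element-wise pattern should cover it. The following is a minimal sketch, not part of the original answer: `nan_to_list` is a hypothetical helper name, the small frame is made-up sample data, and the `DataFrame.map` fallback assumes pandas 2.1 or later, where `applymap` was deprecated in favor of `map`.

```python
import numpy as np
import pandas as pd

# Small made-up frame in the same shape as the question's data
df = pd.DataFrame({
    'A': [[1, 2], np.nan, [3]],
    'E': [np.nan, np.nan, np.nan],
})

def nan_to_list(cell):
    # Keep existing lists; replace anything else (here: NaN) with a NEW [''] list
    return cell if isinstance(cell, list) else ['']

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap
elementwise = df.map if hasattr(pd.DataFrame, 'map') else df.applymap
result = elementwise(nan_to_list)
print(result)
```

Because the helper builds the list inside the function, every replaced cell gets its own list object, so mutating one cell later does not change the others.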

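Note: please pay attention to @jpp's comment. Storing non-scalar values in cells defeats 90% of the magic of Pandas/NumPy, because most of the fast internal vectorized methods expect scalar values in cells; with lists they either do not work at all or do not work as expected.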
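
If the lists are only a storage format, one possible way to get back to scalar cells is to flatten them with `Series.explode` (available since pandas 0.25). This is a rough sketch with made-up data, not something the original answer proposes; the names `s`, `flat` and `totals` are illustrative only.

```python
import pandas as pd

# A column of lists, similar to column 'A' in the question
s = pd.Series([[1, 2, 3], [4, 5], [6]], name='A')

# One scalar per row; the original row label is repeated in the index
flat = s.explode().astype('int64')

# With scalar cells, vectorized operations work again,
# e.g. the sum of each original list
totals = flat.groupby(level=0).sum()
print(totals)
```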


Answer for the dataset as it was before the question was updated:

You can do:

In [120]: df = df.fillna('')

In [121]: df
Out[121]:
A B C D E F
0 zero one 0.226100 1.764036 2
1 one one -1.672476 -0.867188 2
2 two 0.671258 0.125589 4
3 three three 1.135731 0.080577 4
4 four two -1.711692 0.735028 67
5 two 0.608488 1.012977
6 six one -1.233979 -0.623781 78
7 seven three 0.256893 -0.546639 90

However, every column that contains at least one NaN value will be converted to string (object) dtype, because the empty string '' always has a string (object) dtype:

In [122]: df.dtypes
Out[122]:
A object
B object
C float64
D float64
E object
F object
dtype: object
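
The dtype change is easy to reproduce on a tiny frame. The sketch below uses made-up column names (`with_nan`, `no_nan`) and assumes the default NumPy-backed float columns; only the column that actually contained a NaN ends up as object.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'with_nan': [1.5, np.nan, 3.0],  # float64 column containing one NaN
    'no_nan': [0.1, 0.2, 0.3],       # float64 column without NaN
})

filled = df.fillna('')

# The column that received '' now mixes floats and strings
print(filled.dtypes)
```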

Regarding "python - How to convert NaN to [''] in all elements of a Pandas DataFrame?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50966435/
