
python - How to handle a Pandas Series data type that has NaN?

Reposted. Author: 行者123. Updated: 2023-11-28 21:08:51

What happens when max() and min() are used on a pandas.core.series.Series that contains NaN? Is this a bug? See below:


%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

mydata = pd.DataFrame(np.random.standard_normal((100,1)), columns=['No NaN'])
mydata['Has NaN'] = mydata['No NaN'] / mydata['No NaN'].shift(1)

# Both return NaN!
print(min(mydata['Has NaN']), max(mydata['Has NaN']))
# Still why False? Isn't float('nan') a singleton like None?
print(min(mydata['Has NaN']) == max(mydata['Has NaN']))
# But this time works well!
print(min([1, 2, 3, float('nan')]))

print('\n')

# When Series data type that has NaN bumps into min() and max(), what should
# I do? E.g.,
try:
    n, bins, patches = plt.hist(mydata['Has NaN'], 10)
except ValueError as e:
    print(e, '\nSeems "range" argument in hist() has problem!')

Best answer

First, you should not use Python's built-in max and min when working with pandas or numpy, especially when NaN values are involved.

Because NaN is the first item of mydata['Has NaN'], it is never replaced inside max or min, because (as stated in the docs):

The not-a-number values float('NaN') and Decimal('NaN') are special. They are identical to themselves (x is x is true) but are not equal to themselves (x == x is false). Additionally, comparing any number to a not-a-number value will return False. For example, both 3 < float('NaN') and float('NaN') < 3 will return False.
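
The quoted rule can be checked with plain Python. The following is a minimal sketch; the lists nan_first and nan_last are illustrative names that do not appear in the original question. The built-in min() and max() keep the first item as the running candidate and replace it only when a comparison with a later item is True, so a leading NaN is never displaced, while a trailing NaN is never selected.

```python
import math

nan = float('nan')

# Every ordering comparison that involves NaN is False,
# and NaN is not equal to itself even though it is identical to itself.
print(3 < nan, nan < 3)      # False False
print(nan == nan, nan is nan)  # False True

# Illustrative data (not from the question): same values, NaN in a different position.
nan_first = [nan, 1.0, 2.0, 3.0]
nan_last = [1.0, 2.0, 3.0, nan]

# NaN first: no later item ever compares as smaller or larger, so NaN survives.
print(min(nan_first), max(nan_first))  # nan nan

# NaN last: NaN never compares as smaller or larger than the current candidate.
print(min(nan_last), max(nan_last))    # 1.0 3.0

# This is why min(...) == max(...) printed False in the question:
# both sides are NaN, and NaN == NaN is always False.
print(min(nan_first) == max(nan_first))  # False
```

This also accounts for why min([1, 2, 3, float('nan')]) in the question appeared to work: the NaN there is the last item, not the first.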

Instead, use the pandas max and min methods:

In [4]: mydata['Has NaN'].min()
Out[4]: -176.9844930355774

In [5]: mydata['Has NaN'].max()
Out[5]: 12.684033138603787
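
The pandas methods return real numbers here because they skip NaN by default (the skipna parameter defaults to True). A small sketch with made-up values, not the random data from the question; np.nanmin and np.nanmax are shown as the NumPy counterparts that ignore NaN:

```python
import numpy as np
import pandas as pd

# Illustrative Series with NaN as its first item, like mydata['Has NaN'].
s = pd.Series([np.nan, 2.5, -1.0, 4.0])

# Default behaviour: NaN entries are skipped.
print(s.min(), s.max())  # -1.0 4.0

# With skipna=False the NaN propagates into the result.
print(s.min(skipna=False))  # nan

# NumPy equivalents that ignore NaN.
values = s.to_numpy()
print(np.nanmin(values), np.nanmax(values))  # -1.0 4.0
```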

As for the histogram, this appears to be a known issue with plt.hist; see here and here.

For now, though, it is fairly simple to work around by dropping the NaN values before plotting:

n, bins, patches = plt.hist(mydata['Has NaN'][~mydata['Has NaN'].isnull()], 10)
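
Series.dropna() is a slightly shorter way to write the same filter. The sketch below rebuilds data shaped like the question's (with a fixed seed, which is an assumption added here for reproducibility) and uses np.histogram, the function matplotlib's hist uses for binning, so the counts can be inspected without opening a plot window.

```python
import numpy as np
import pandas as pd

# Same construction as in the question, but seeded (an assumption for repeatability).
rng = np.random.default_rng(0)
mydata = pd.DataFrame(rng.standard_normal((100, 1)), columns=['No NaN'])
mydata['Has NaN'] = mydata['No NaN'] / mydata['No NaN'].shift(1)

# dropna() removes the NaN entries, here the first row produced by shift(1).
clean = mydata['Has NaN'].dropna()

# Equivalent to the boolean-mask filter used in the answer.
masked = mydata['Has NaN'][~mydata['Has NaN'].isnull()]
print(clean.equals(masked))  # True

# Bin the cleaned values; every remaining value lands in one of the 10 bins.
counts, bin_edges = np.histogram(clean, bins=10)
print(counts.sum() == len(clean))  # True
```

The cleaned Series can be passed to plt.hist(clean, 10) in exactly the same way.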


Regarding "python - How to handle a Pandas Series data type that has NaN?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/39304173/
