
python - Identifying leading and trailing NAs in a pandas DataFrame

Reposted. Author: 行者123. Updated: 2023-12-04 18:55:09

Is there a way to identify leading and trailing NAs in a pandas.DataFrame?

Currently I do the following, but it does not seem straightforward:

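import pandas as pd

df = pd.DataFrame(dict(a=[0.1, 0.2, 0.2],
                       b=[None, 0.1, None],
                       c=[0.1, None, 0.1]))
# True where no valid value has been seen yet, scanning from the top
lead_na = (df.isnull() == False).cumsum() == 0
# Same check scanning from the bottom, then restored to the original row order
trail_na = (df.iloc[::-1].isnull() == False).cumsum().iloc[::-1] == 0
trail_lead_nas = lead_na | trail_na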

Any ideas on how to express this more efficiently?

Answers:
%timeit df.ffill().isna() | df.bfill().isna()
The slowest run took 29.24 times longer than the fastest. This could mean that
an intermediate result is being cached.
31 ms ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit ((df.isnull() == False).cumsum() == 0) | ((df.iloc[::-1].isnull() ==False).cumsum().iloc[::-1] == 0)
255 ms ± 66.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
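The `%timeit` lines above only work inside IPython. A rough sketch of the same comparison with the standard-library `timeit` module might look like the following. The function names `fill_based` and `cumsum_based`, the `timings` dict, and the frame size (1000 repeats of the 3-row example) are assumptions made for illustration, since the size of the frame behind the numbers above is not stated. Absolute timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

base = pd.DataFrame(dict(a=[0.1, 0.2, 0.2],
                         b=[None, 0.1, None],
                         c=[0.1, None, 0.1]))
# Assumption: 1000 repeats of the example frame; the original size is not given.
df = pd.concat([base] * 1000, ignore_index=True)


def fill_based(frame):
    # Hypothetical wrapper around the ffill/bfill expression timed above.
    return frame.ffill().isna() | frame.bfill().isna()


def cumsum_based(frame):
    # Hypothetical wrapper around the cumsum expression timed above.
    return ((frame.isnull() == False).cumsum() == 0) | (
        (frame.iloc[::-1].isnull() == False).cumsum().iloc[::-1] == 0)


timings = {}
for func in (fill_based, cumsum_based):
    # Best of 5 repeats, 20 calls per repeat, converted to milliseconds per call.
    best = min(timeit.repeat(lambda: func(df), number=20, repeat=5))
    timings[func.__name__] = best / 20 * 1000
    print(f"{func.__name__}: {timings[func.__name__]:.3f} ms per call")
```

Taking the minimum over several repeats reduces the effect of one-off slow runs, such as the caching warning shown in the first measurement.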

Best answer

How about this:

df.ffill().isna() | df.bfill().isna()

Out[769]:
       a      b      c
0  False   True  False
1  False  False  False
2  False   True  False
df = pd.concat([df] * 1000, ignore_index=True)

In [134]: %%timeit
...: lead_na = (df.isnull() == False).cumsum() == 0
...: trail_na = (df.iloc[::-1].isnull() == False).cumsum().iloc[::-1] == 0
...: trail_lead_nas = lead_na | trail_na
...:
11.8 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [135]: %%timeit
...: df.ffill().isna() | df.bfill().isna()
...:
2.1 ms ± 50 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
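As a sketch of why the one-liner works: a forward fill can only leave NaN in positions where no valid value has appeared yet (leading NAs), and a backward fill can only leave NaN where no valid value follows (trailing NAs). The check below uses the example frame from the question. The variable names `fill_mask`, `valid`, and `cumsum_mask` are introduced here for illustration only.

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame(dict(a=[0.1, 0.2, 0.2],
                       b=[None, 0.1, None],
                       c=[0.1, None, 0.1]))

# After a forward fill, NaN remains only before the first valid value.
lead_na = df.ffill().isna()
# After a backward fill, NaN remains only after the last valid value.
trail_na = df.bfill().isna()
fill_mask = lead_na | trail_na

# The cumsum-based version from the question, written with notna() for comparison.
valid = df.notna()
cumsum_mask = (valid.cumsum() == 0) | (valid.iloc[::-1].cumsum().iloc[::-1] == 0)

# Check that both approaches produce the same boolean mask on this frame.
print(fill_mask.equals(cumsum_mask))
```

Keeping `lead_na` and `trail_na` as separate frames also gives the leading-only and trailing-only masks, if those are needed individually. Note that the interior NaN in column `c` is flagged by neither approach.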

Regarding python - identifying leading and trailing NAs in a pandas DataFrame, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/59820159/
